Preface



Since 2008 this mathematics lecture has been offered for the master's courses in computer
science, mechatronics and electrical engineering. After a review of basic linear algebra,
computer algebra and calculus, we will treat numerical calculus, statistics and function
approximation, which are the most important basic mathematics topics for engineers.

We also provide an introduction to computer algebra. Mathematica, Matlab and Octave
are powerful tools for the exercises. Even though we favour the open source tool Octave,
the student is free to choose any one of the three.

We are looking forward to interesting semesters with many motivated and eager
students who want to climb the steep, high and fascinating mountain of engineering
mathematics together with us. I assure you that we will do our best to guide you through
the sometimes wild, rough and challenging nature of mathematics. I also assure you that all
your efforts and your endurance in working on the exercises during nights and weekends will
pay off in good marks and, most importantly, in a lot of fun.

Even though we review some undergraduate linear algebra and calculus, the failure rate
in the exams is very high, in particular among the foreign students. As a consequence, we
strongly recommend that all our students review undergraduate linear algebra, such as operations
on matrices, solution of linear systems, singularity of matrices, inversion, eigenvalue
problems, and row, column and null spaces. You should also bring decent knowledge of
one-dimensional and multidimensional calculus, e.g. differentiation and integration in one and
many variables, convergence of sequences and series, and finding extrema of multivariate
functions under constraints. Basic statistics is also required. To summarize: if you are not
able to solve problems (not merely know the terms) in these fields, you have very
little chance of successfully finishing this course.

History of this Course 

The first version of this script was created in the winter semester 1995/96. I included
only numerics in this lecture, although I initially wanted to cover discrete mathematics too,
which is very important for computer scientists. If both are covered in a lecture of
three semester hours per week, each can only be treated superficially. Therefore I decided,
like my colleagues, to focus on numerics. Only then is it possible to impart profound knowledge.

From numerical calculus, besides the basics, systems of linear equations, various
interpolation methods, function approximation, and the solution of nonlinear equations will be
presented. An excursion into applied research follows, where, e.g. in the field of benchmarking
of microprocessors, mathematics (functional equations) directly influences the practice
of computer scientists.

In summer 1998 a chapter about statistics was added, because of the weak coverage at
our university until then. In the winter semester 1999/2000, the layout and structure were
improved and some mistakes were removed.

With the changes to the curriculum of Applied Computer Science in the summer semester 2002,
statistics was moved, because of its general relevance for all students, into the lecture
Mathematics 2. In place of statistics, content specifically relevant for computer scientists
was to be included. The generation and verification of random numbers is such an
important topic, which is now also covered.

Since summer 2008, this lecture has been offered only to Master (Computer Science) students.
Therefore the chapter about random numbers was extended. Other content may be
included in the lecture later. For some topics original literature will be handed out; the students
then have to prepare the material by themselves.

For the winter semester 2010/11 the lecture has been completely revised and restructured,
and some important sections have been added, such as radial basis functions, Gaussian processes,
and statistics and probability. These changes became necessary with the step from Diploma
to Master. I want to thank Markus Schneider and Haitham Bou Ammar, who helped me
improve the lecture.

Also for the winter semester 2010/11, the precourse will be integrated into the lecture in order to
give the students more time to work on the exercises. Thus, the volume of the lecture grows
from 6 SWS (semester hours per week) to 8 SWS, and we will now split it into two lectures of 4 SWS each.

In the winter semester 2012/13 we go back to a one semester schedule with 6 hours per week 
for computer science and mechatronics students. Electrical engineering students will only go 
for four hours, covering chapters one to six. 

Wolfgang Ertel
Chapter 1

Linear Algebra 



1.1 Video Lectures 

We use the excellent video lectures from G. Strang, the author of [ ], available from:
http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010. In particular
we show the following lectures:



Lec #   Topics
  1     The geometry of linear equations (lecture 01)
  2     Transposes, Permutations, Spaces $\mathbb{R}^n$ (lecture 05)
  3     Column Space and Nullspace (lecture 06)
  4     Solving Ax = 0: Pivot Variables, Special Solutions (lecture 07)
  5     Independence, Basis, and Dimension (lecture 09)
  6     The Four Fundamental Subspaces (lecture 10)
  7     Orthogonal Vectors and Subspaces (lecture 14)
  8     Properties of Determinants (lecture 18)
  9     Determinant Formulas and Cofactors (lecture 19)
 10     Cramer's rule, inverse matrix, and volume (lecture 20)
 11     Eigenvalues and Eigenvectors (lecture 21)
 12     Symmetric Matrices and Positive Definiteness (lecture 25)
 13     Linear Transformations and Their Matrices (lecture 30)



1.2 Exercises 

Exercise 1.1 Solve the nonsingular triangular system 

$u + v + w = b_1$ (1.1)

$v + w = b_2$ (1.2)

$w = b_3$ (1.3)

Show that your solution gives a combination of the columns that equals the column on the 
right. 

Exercise 1.2 Explain why the system 

$u + v + w = 2$ (1.4)

$u + 2v + 3w = 1$ (1.5)

$v + 2w = 0$ (1.6)

is singular, by finding a combination of the three equations that adds up to 0 = 1. What 
value should replace the last zero on the right side, to allow the equations to have solutions, 
and what is one of the solutions? 

Inverses and Transposes 

Exercise 1.3 Which properties of a matrix A are preserved by its inverse (assuming $A^{-1}$
exists)?

(1) A is triangular 

(2) A is symmetric 

(3) A is tridiagonal 

(4) all entries are whole numbers 

(5) all entries are fractions (including whole numbers like $\tfrac{3}{1}$)

Exercise 1.4 

a) How many entries can be chosen independently, in a symmetric matrix of order n? 

b) How many entries can be chosen independently, in a skew-symmetric matrix of order n? 

Permutations and Elimination 

Exercise 1.5 

a) Find a square 3 x 3 matrix P that, multiplied from the left onto any 3 x m matrix A, exchanges
rows 1 and 2.

b) Find a square n x n matrix P that, multiplied from the left onto any n x m matrix A, exchanges
rows i and j.

Exercise 1.6 A permutation is a bijective mapping from a finite set onto itself. Applied
to vectors of length n, a permutation arbitrarily changes the order of the vector components.
The word “ANGSTBUDE” is a permutation of “BUNDESTAG”. An example of a
permutation on vectors of length 5 can be described by

(3, 2, 1, 5, 4).

This means component 3 moves to position 1, component 2 stays where it was, component 
1 moves to position 3, component 5 moves to position 4 and component 4 moves to position 
5. 

a) Give a 5 x 5 matrix P that implements this permutation. 

b) How can we come from a permutation matrix to its inverse? 
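
The verbal description of the permutation (3, 2, 1, 5, 4) can be checked mechanically. The following is only a small Python sketch (Python is not one of the three course tools, and the function name `apply_permutation` is made up for this illustration). It assumes the convention used in the text: entry k of the tuple names the component that moves to position k.

```python
def apply_permutation(perm, vec):
    """Return the vector whose position k holds component perm[k] of vec.

    perm uses 1-based component numbers, as in the text.
    """
    if sorted(perm) != list(range(1, len(vec) + 1)):
        raise ValueError("perm must contain each of 1..n exactly once")
    return [vec[p - 1] for p in perm]


if __name__ == "__main__":
    v = ["v1", "v2", "v3", "v4", "v5"]
    # component 3 moves to position 1, component 2 stays, and so on
    print(apply_permutation((3, 2, 1, 5, 4), v))
```

The sketch works on lists only; writing the same mapping as a 5 x 5 matrix is left to part a) of the exercise.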

Exercise 1.7 

a) Find a 3 x 3 matrix E that, multiplied from the left onto any 3 x m matrix A, adds 5 times row
2 to row 1.

b) Describe an n x n matrix E that, multiplied from the left onto any n x m matrix A, adds k times
row i to row j.

c) Based on the above answers, prove that the elimination process of a matrix can be realized
by successive multiplication with matrices from the left.




Column Spaces and Nullspaces

Exercise 1.8 Which of the following subsets of $\mathbb{R}^3$ are actually subspaces?

a) The plane of vectors with first component $b_1 = 0$.

b) The plane of vectors b with $b_1 = 1$.

c) The vectors b with $b_1 b_2 = 0$ (this is the union of two subspaces, the plane $b_1 = 0$ and the
plane $b_2 = 0$).

d) The solitary vector b = (0, 0, 0).

e) All combinations of two given vectors x = (1, 1, 0) and y = (2, 0, 1).

f) The vectors $(b_1, b_2, b_3)$ that satisfy $b_3 - b_2 + 3b_1 = 0$.

Exercise 1.9 Let P be the plane in 3-space with equation x + 2y + z = 6. What is the
equation of the plane $P_0$ through the origin parallel to P? Are P and $P_0$ subspaces of $\mathbb{R}^3$?

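Exercise 1.10 Which descriptions are correct? The solutions x of

$$Ax = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \qquad (1.7)$$

form a plane, line, point, subspace, nullspace of A, column space of A.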



Ax = 0 and Pivot Variables 

Exercise 1.11 For the matrix

$$A = \begin{bmatrix} 0 & 1 & 4 & 0 \\ 0 & 2 & 8 & 0 \end{bmatrix} \qquad (1.8)$$

determine the echelon form U, the basic variables, the free variables, and the general solution
to Ax = 0. Then apply elimination to Ax = b, with components $b_1$ and $b_2$ on the right side;
find the conditions for Ax = b to be consistent (that is, to have a solution) and find the
general solution in the same form as Equation (3). What is the rank of A?

Exercise 1.12 Write the general solution to

$$\begin{bmatrix} 1 & 2 & 2 \\ 2 & 4 & 5 \end{bmatrix} \begin{bmatrix} u \\ v \\ w \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \end{bmatrix} \qquad (1.9)$$

as the sum of a particular solution to Ax = b and the general solution to Ax = 0, as in (3).

Exercise 1.13 Find the value of c which makes it possible to solve

$u + v + 2w = 2$ (1.10)

$2u + 3v - w = 5$ (1.11)

$3u + 4v + w = c$ (1.12)

Solving Ax = b

Exercise 1.14 Is it true that if $v_1, v_2, v_3$ are linearly independent, then the vectors
$w_1 = v_1 + v_2$, $w_2 = v_1 + v_3$, $w_3 = v_2 + v_3$ are also linearly independent? (Hint: Assume some
combination $c_1 w_1 + c_2 w_2 + c_3 w_3 = 0$, and find which $c_i$ are possible.)

Exercise 1.15 Find a counterexample to the following statement: If $v_1, v_2, v_3, v_4$ is a basis
for the vector space $\mathbb{R}^4$, and if W is a subspace, then some subset of the v's is a basis for
W.

Exercise 1.16 Suppose V is known to have dimension k. Prove that 

a) any k independent vectors in V form a basis; 

b) any k vectors that span V form a basis. 

In other words, if the number of vectors is known to be right, either of the two properties of 
a basis implies the other. 

Exercise 1.17 Prove that if V and W are three-dimensional subspaces of $\mathbb{R}^5$, then V and
W must have a nonzero vector in common. Hint: Start with bases of the two subspaces,
making six vectors in all.



The Four Fundamental Subspaces 



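Exercise 1.18 Find the dimension and construct a basis for the four subspaces associated
with each of the matrices

$$A = \begin{bmatrix} 0 & 1 & 4 & 0 \\ 0 & 2 & 8 & 0 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 0 & 1 & 4 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \qquad (1.13)$$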



Exercise 1.19 If the product of two matrices is the zero matrix, AB = 0, show that the
column space of B is contained in the nullspace of A. (Also the row space of A is in the left
nullspace of B, since each row of A multiplies B to give a zero row.)

Exercise 1.20 Explain why Ax = b is solvable if and only if rank A = rank A', where A' 
is formed from A by adding b as an extra column. Hint: The rank is the dimension of the 
column space; when does adding an extra column leave the dimension unchanged? 

Exercise 1.21 Suppose A is an m by n matrix of rank r. Under what conditions on those 
numbers does 

a) A have a two-sided inverse: $AA^{-1} = A^{-1}A = I$?

b) Ax = b have infinitely many solutions for every b?

Exercise 1.22 If Ax = 0 has a nonzero solution, show that $A^T y = f$ fails to be solvable for
some right sides f. Construct an example of A and f.



Orthogonality 

Exercise 1.23 In $\mathbb{R}^3$ find all vectors that are orthogonal to (1, 1, 1) and (1, -1, 0). Produce
from these vectors a mutually orthogonal system of unit vectors (an orthonormal system) in
$\mathbb{R}^3$.

Exercise 1.24 Show that $x - y$ is orthogonal to $x + y$ if and only if $\|x\| = \|y\|$.

Exercise 1.25 Let P be the plane (not a subspace) in 3-space with equation x + 2y - z = 6.
Find the equation of a plane P' parallel to P but going through the origin. Find also a
vector perpendicular to those planes. What matrix has the plane P' as its nullspace, and
what matrix has P' as its row space?



Projections 

Exercise 1.26 Suppose A is the 4 x 4 identity matrix with its last column removed. A
is 4 x 3. Project b = (1, 2, 3, 4) onto the column space of A. What shape is the projection
matrix P and what is P?



Determinants 



Exercise 1.27 How are det(2A), det(-A), and det($A^2$) related to det A, when A is n by
n?

Exercise 1.28 Find the determinants of:

a) a rank one matrix

A = [ ... -1 2 ]   (1.14)

b) the upper triangular matrix

$$U = \begin{bmatrix} 4 & 4 & 8 & 8 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 2 & 6 \\ 0 & 0 & 0 & 2 \end{bmatrix} \qquad (1.15)$$

c) the lower triangular matrix $U^T$;

d) the inverse matrix $U^{-1}$;

e) the “reverse-triangular” matrix that results from row exchanges,

$$M = \begin{bmatrix} 0 & 0 & 0 & 2 \\ 0 & 0 & 2 & 6 \\ 0 & 1 & 2 & 2 \\ 4 & 4 & 8 & 8 \end{bmatrix} \qquad (1.16)$$



Exercise 1.29 If every row of A adds to zero prove that det A = 0. If every row adds to 1
prove that det(A - I) = 0. Show by example that this does not imply det A = 1.

Properties of Determinants

Exercise 1.30 Suppose $A_n$ is the n by n tridiagonal matrix with 1's everywhere on the
three diagonals:

$$A_1 = \begin{bmatrix} 1 \end{bmatrix}, \quad A_2 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad A_3 = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \end{bmatrix}, \quad \ldots \qquad (1.17)$$

Let $D_n$ be the determinant of $A_n$; we want to find it.

a) Expand in cofactors along the first row of $A_n$ to show that $D_n = D_{n-1} - D_{n-2}$.

b) Starting from $D_1 = 1$ and $D_2 = 0$ find $D_3, D_4, \ldots, D_8$. By noticing how these numbers
cycle around (with what period?) find $D_{1000}$.

Exercise 1.31 Explain why a 5 by 5 matrix with a 3 by 3 zero submatrix is sure to be
singular (regardless of the 16 nonzeros marked by x's):

$$\text{the determinant of } A = \begin{bmatrix} x & x & x & x & x \\ x & x & x & x & x \\ 0 & 0 & 0 & x & x \\ 0 & 0 & 0 & x & x \\ 0 & 0 & 0 & x & x \end{bmatrix} \text{ is zero.} \qquad (1.18)$$



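Exercise 1.32 If A is m by n and B is n by m, show that

$$\det \begin{bmatrix} 0 & A \\ -B & I \end{bmatrix} = \det AB. \qquad \left(\text{Hint: Postmultiply by } \begin{bmatrix} I & 0 \\ B & I \end{bmatrix}.\right) \qquad (1.19)$$

Do an example with m < n and an example with m > n. Why does the second example
have det AB = 0?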



Cramer's Rule

Exercise 1.33 The determinant is a linear function of column 1. It is zero if two
columns are equal. When $b = Ax = x_1 a_1 + x_2 a_2 + x_3 a_3$ goes into the first column of A, then
the determinant of this matrix $B_1$ is

$$|b \;\; a_2 \;\; a_3| = |x_1 a_1 + x_2 a_2 + x_3 a_3 \;\; a_2 \;\; a_3| = x_1 |a_1 \;\; a_2 \;\; a_3| = x_1 \det A.$$

a) What formula for $x_1$ comes from left side = right side?

b) What steps lead to the middle equation?

Eigenvalues and Eigenvectors 

Exercise 1.34 Suppose that $\lambda$ is an eigenvalue of A, and x is its eigenvector: $Ax = \lambda x$.

a) Show that this same x is an eigenvector of $B = A - 7I$, and find the eigenvalue.

b) Assuming $\lambda \neq 0$, show that x is also an eigenvector of $A^{-1}$ and find the eigenvalue.




Exercise 1.35 Show that the determinant equals the product of the eigenvalues by imagining
that the characteristic polynomial is factored into

$$\det(A - \lambda I) = (\lambda_1 - \lambda)(\lambda_2 - \lambda) \cdots (\lambda_n - \lambda) \qquad (1.20)$$

and making a clever choice of $\lambda$.

Exercise 1.36 Show that the trace equals the sum of the eigenvalues, in two steps. First,
find the coefficient of $(-\lambda)^{n-1}$ on the right side of (1.20). Next, look for all the terms in

$$\det(A - \lambda I) = \det \begin{bmatrix} a_{11} - \lambda & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} - \lambda & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} - \lambda \end{bmatrix} \qquad (1.21)$$

which involve $(-\lambda)^{n-1}$. Explain why they all come from the product down the main diagonal,
and find the coefficient of $(-\lambda)^{n-1}$ on the left side of (1.20). Compare.



Diagonalization of Matrices 



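Exercise 1.37 Factor the following matrices into $S \Lambda S^{-1}$:

$$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 2 & 1 \\ 0 & 0 \end{bmatrix} \qquad (1.22)$$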



Exercise 1.38 Suppose $A = uv^T$ is a column times a row (a rank-one matrix).

a) By multiplying A times u show that u is an eigenvector. What is $\lambda$?

b) What are the other eigenvalues (and why)?

c) Compute trace(A) = $v^T u$ in two ways, from the sum on the diagonal and the sum of $\lambda$'s.

Exercise 1.39 If A is diagonalizable, show that the determinant of $A = S \Lambda S^{-1}$ is the
product of the eigenvalues.



Symmetric and Positive Semi-Definite Matrices 

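Exercise 1.40 If $A = Q \Lambda Q^T$ is symmetric positive definite, then $R = Q \sqrt{\Lambda} Q^T$ is its
symmetric positive definite square root. Why does R have real eigenvalues? Compute R and
verify $R^2 = A$ for

$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 10 & -6 \\ -6 & 10 \end{bmatrix} \qquad (1.23)$$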



Exercise 1.41 If A is symmetric positive definite and C is nonsingular, prove that
$B = C^T A C$ is also symmetric positive definite.

Exercise 1.42 If A is positive definite and $a_{11}$ is increased, prove from cofactors that the
determinant is increased. Show by example that this can fail if A is indefinite.
Linear Transformation

Exercise 1.43 Suppose a linear mapping T transforms (1, 1) to (2, 2) and (2, 0) to (0, 0).
Find T(v):

(a) v = (2, 2)   (b) v = (3, 1)   (c) v = (-1, 1)   (d) v = (a, b)

Exercise 1.44 Suppose T is reflection across the 45° line, and S is reflection across the y
axis. If v = (2, 1) then T(v) = (1, 2). Find S(T(v)) and T(S(v)). This shows that generally
$ST \neq TS$.

Exercise 1.45 Suppose we have two bases $v_1, \ldots, v_n$ and $w_1, \ldots, w_n$ for $\mathbb{R}^n$. If a vector has
coefficients $b_i$ in one basis and $c_i$ in the other basis, what is the change of basis matrix in
$b = Mc$? Start from

$$b_1 v_1 + \ldots + b_n v_n = Vb = c_1 w_1 + \ldots + c_n w_n = Wc. \qquad (1.24)$$

Your answer represents T(v) = v with input basis of v's and output basis of w's. Because of
different bases, the matrix is not I.




Chapter 2 

Computer Algebra 



Definition 2.1 Computer Algebra = Symbol Processing + Numerics + Graphics

Definition 2.2 Symbol Processing is calculating with symbols (variables, constants,
function symbols), as in mathematics lectures.

Advantages of Symbol Processing: 

• often considerably less computational effort compared to numerics.

• symbolic results (for further calculations); proofs in the strict sense are possible.

Disadvantages of Symbol Processing:

• often there is no symbolic (closed form) solution; then numerics must be applied,
e.g.:

- Calculation of antiderivatives
- Solving nonlinear equations like $e^x = \sin x$
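
As an illustration of what "applying numerics" can look like for an equation without a closed-form solution, here is a minimal Python sketch that finds one solution of e^x = sin x by bisection. Bisection is chosen here only because it is simple; the notes do not prescribe a method at this point. The function names `bisect` and `g` and the bracket [-4, -3] are assumptions of this sketch (the bracket was picked because g changes sign on it).

```python
import math


def bisect(f, lo, hi, tol=1e-12, max_iter=200):
    """Find a root of f in [lo, hi] by repeated interval halving.

    Requires f(lo) and f(hi) to have opposite signs.
    """
    f_lo, f_hi = f(lo), f(hi)
    if f_lo * f_hi > 0:
        raise ValueError("f(lo) and f(hi) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_mid == 0 or hi - lo < tol:
            return mid
        if f_lo * f_mid < 0:
            hi = mid                 # root lies in the left half
        else:
            lo, f_lo = mid, f_mid    # root lies in the right half
    return 0.5 * (lo + hi)


def g(x):
    # e^x = sin x  is equivalent to  g(x) = 0
    return math.exp(x) - math.sin(x)


if __name__ == "__main__":
    root = bisect(g, -4.0, -3.0)
    print("x =", root, " residual =", g(root))
```

The result is only an approximation within the chosen tolerance, which is exactly the trade-off against a symbolic solution.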

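Example 2.1

1. symbolic:

$$\lim_{x \to \infty} \left( \frac{\ln x}{x+1} \right)' = ? \qquad \text{(asymptotic behavior)}$$

$$\left( \frac{\ln x}{x+1} \right)' = \frac{\frac{1}{x}(x+1) - \ln x}{(x+1)^2} = \frac{1}{(x+1)x} - \frac{\ln x}{(x+1)^2}$$

$$x \to \infty: \quad \left( \frac{\ln x}{x+1} \right)' \to \frac{1}{x^2} - \frac{\ln x}{x^2} \to -\frac{\ln x}{x^2}$$

2. numeric:

$$\lim_{x \to \infty} f'(x) = ?$$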
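
The numeric route of Example 2.1 can be tried directly by evaluating the derivative of f(x) = ln x / (x + 1) at growing x and watching the values. The Python sketch below does this; the function name `f_prime` is made up for the illustration, and the derivative formula is the quotient-rule result, not something the computer derives here.

```python
import math


def f_prime(x):
    """Derivative of ln(x)/(x+1), from the quotient rule."""
    return 1.0 / ((x + 1.0) * x) - math.log(x) / (x + 1.0) ** 2


if __name__ == "__main__":
    for k in range(1, 9):
        x = 10.0 ** k
        approx = -math.log(x) / x ** 2      # asymptotic form for large x
        print(f"x = 1e{k}:  f'(x) = {f_prime(x): .6e}   -ln(x)/x^2 = {approx: .6e}")
```

Such a table only suggests the limit; it does not prove it, which is the point of contrasting the numeric with the symbolic approach.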
Example 2.2 Numerical solution of $x^2 = 5$

$$x^2 = 5 \;\Rightarrow\; x = \frac{5}{x} \;\Rightarrow\; 2x = x + \frac{5}{x} \;\Rightarrow\; x = \frac{1}{2}\left(x + \frac{5}{x}\right)$$

iteration:

$$x_{n+1} = \frac{1}{2}\left(x_n + \frac{5}{x_n}\right)$$

n    x_n
0    2             <- start value
1    2.25
2    2.236111
3    2.23606798
4    2.23606798

$\Rightarrow \sqrt{5} = 2.23606798 \pm 10^{-8}$

(approximate solution)
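
The iteration of Example 2.2 is short enough to try out directly. The following Python sketch (the function name `heron_sqrt` and the stopping tolerance are choices made for this illustration) repeats x_{n+1} = (x_n + a/x_n)/2 until two successive iterates differ by less than the tolerance and returns all iterates.

```python
def heron_sqrt(a, x0, tol=1e-8, max_iter=50):
    """Iterate x_{n+1} = (x_n + a/x_n)/2 until successive values differ by < tol.

    Returns the list of iterates [x_0, x_1, ...].
    """
    xs = [x0]
    for _ in range(max_iter):
        x_next = 0.5 * (xs[-1] + a / xs[-1])
        xs.append(x_next)
        if abs(xs[-1] - xs[-2]) < tol:
            break
    return xs


if __name__ == "__main__":
    # a = 5 with start value 2, as in the table of Example 2.2
    for n, x in enumerate(heron_sqrt(5.0, 2.0)):
        print(n, f"{x:.8f}")
```

Run with a = 5 and start value 2, the printed iterates should agree with the table of the example up to rounding.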



2.1 Symbol Processing on the Computer 

Example 2.3 Symbolic computing with natural numbers:
Calculation rules, i.e. axioms, are necessary. $\Rightarrow$ Peano axioms, e.g.:

$$\forall x, y, z: \quad x + y = y + x \qquad (2.1)$$

$$x + 0 = x \qquad (2.2)$$

$$(x + y) + z = x + (y + z) \qquad (2.3)$$

Out of these rules, e.g. $0 + x = x$ can be deduced:

$$0 + x \overset{(2.1)}{=} x + 0 \overset{(2.2)}{=} x$$

Implementation of symbol processing on the computer by "Term Rewriting".
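
To give a rough idea of what term rewriting means in practice, here is a deliberately tiny Python sketch. It is not how Mathematica or Maple are implemented; the term representation (nested tuples) and the function name `rewrite` are assumptions of this illustration. It applies the rule x + 0 -> x and the derived rule 0 + x -> x bottom-up to a term.

```python
def rewrite(term):
    """Simplify a term bottom-up using the rules  x + 0 -> x  and  0 + x -> x.

    Terms are integers, variable names (str), or tuples ("+", left, right).
    """
    if not isinstance(term, tuple):
        return term                       # atom: nothing to rewrite
    op, left, right = term
    left, right = rewrite(left), rewrite(right)
    if op == "+" and right == 0:          # rule (2.2): x + 0 = x
        return left
    if op == "+" and left == 0:           # derived rule: 0 + x = x
        return right
    return (op, left, right)


if __name__ == "__main__":
    t = ("+", ("+", 0, "x"), ("+", "y", 0))   # (0 + x) + (y + 0)
    print(rewrite(t))
```

A rule such as commutativity cannot be applied this naively, because it could be applied forever; real systems need a strategy that guarantees termination.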
Example 2.4 (Real Numbers) Chain rule for differentiation:

$$[f(g(x))]' \Rightarrow f'(g(x))\, g'(x)$$

$$\sin(\ln x + 2)' = \cos(\ln x + 2) \cdot \frac{1}{x}$$

Computer: (Pattern matching)

sin(Plus(ln x, 2))' = cos(Plus(ln x, 2)) Plus'(ln x, 2)
sin(Plus(ln x, 2))' = cos(Plus(ln x, 2)) Plus(ln' x, 2')
sin(Plus(ln x, 2))' = cos(Plus(ln x, 2)) Plus(1/x, 0)
sin(Plus(ln x, 2))' = cos(Plus(ln x, 2)) 1/x
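
The pattern-matching steps of Example 2.4 can be imitated in a few lines. The Python sketch below is only an illustration under simplifying assumptions: terms are nested tuples, the head names ("plus", "times", "sin", "cos", "ln", "inv") and the function name `diff` are made up here, and no simplification is performed afterwards.

```python
def diff(term):
    """Differentiate a term with respect to x by matching on its head symbol.

    Terms: numbers, symbols (str), or tuples such as ("plus", a, b),
    ("times", a, b), ("sin", a), ("ln", a).
    """
    if term == "x":
        return 1
    if not isinstance(term, tuple):
        return 0                                    # constants and other symbols
    head, *args = term
    if head == "plus":                              # sum rule
        return ("plus", diff(args[0]), diff(args[1]))
    if head == "times":                             # product rule
        a, b = args
        return ("plus", ("times", diff(a), b), ("times", a, diff(b)))
    if head == "sin":                               # chain rule with f = sin
        return ("times", ("cos", args[0]), diff(args[0]))
    if head == "ln":                                # chain rule with f = ln
        return ("times", ("inv", args[0]), diff(args[0]))
    raise ValueError(f"no differentiation rule for {head!r}")


if __name__ == "__main__":
    # sin(ln x + 2), written as sin(Plus(ln x, 2))
    print(diff(("sin", ("plus", ("ln", "x"), 2))))
```

The result still contains factors such as "times 1" and summands "0"; removing them is the job of a separate rewriting step, like the simplification rules a computer algebra system applies.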




Effective systems: 

• Mathematica (S. Wolfram & Co.) 

• Maple (ETH Zurich + Univ. Waterloo, Canada)

2.2 Short Introduction to Mathematica 

Resources: 

• Library: Mathematica Handbook (Wolfram) 

• Mathematica Documentation Online: http://reference.wolfram.com 

• http://www.hs-weingarten.de/~ertel/vorlesungen/mae/links.html

2.2.0.1 Some examples as jump start

In[1]:= 3 + 2^3

Out[1]= 11

In[2]:= Sqrt[10]

Out[2]= Sqrt[10]

In[3]:= N[Sqrt[10]]

Out[3]= 3.16228

In[4]:= N[Sqrt[10],60]

Out[4]= 3.1622776601683793319988935444327185337195551393252168268575

In[5]:= Integrate[x^2 Sin[x]^2, x]

Out[5]= (4 x^3 - 6 x Cos[2 x] + 3 Sin[2 x] - 6 x^2 Sin[2 x]) / 24

In[7]:= D[%, x]

Out[7]= (12 x^2 - 12 x^2 Cos[2 x]) / 24
In[8]:= Simplify[%]

Out[8]= x^2 Sin[x]^2

In[9]:= Series[Exp[x], {x,0,6}]

Out[9]= 1 + x + x^2/2 + x^3/6 + x^4/24 + x^5/120 + x^6/720 + O[x]^7

In[10]:= Expand[(x + 2)^3 + ((x - 5)^2 (x + y)^2)^3]

Out[10]= 8 + 12 x + 6 x^2 + x^3 + 15625 x^6 - 18750 x^7 + 9375 x^8 - 2500 x^9 +
>    375 x^10 - 30 x^11 + x^12 + 93750 x^5 y - 112500 x^6 y + 56250 x^7 y -
>    15000 x^8 y + 2250 x^9 y - 180 x^10 y + 6 x^11 y + 234375 x^4 y^2 -
>    281250 x^5 y^2 + 140625 x^6 y^2 - 37500 x^7 y^2 + 5625 x^8 y^2 - 450 x^9 y^2 +
>    15 x^10 y^2 + 312500 x^3 y^3 - 375000 x^4 y^3 + 187500 x^5 y^3 - 50000 x^6 y^3 +
>    7500 x^7 y^3 - 600 x^8 y^3 + 20 x^9 y^3 + 234375 x^2 y^4 - 281250 x^3 y^4 +
>    140625 x^4 y^4 - 37500 x^5 y^4 + 5625 x^6 y^4 - 450 x^7 y^4 + 15 x^8 y^4 +
>    93750 x y^5 - 112500 x^2 y^5 + 56250 x^3 y^5 - 15000 x^4 y^5 + 2250 x^5 y^5 -
>    180 x^6 y^5 + 6 x^7 y^5 + 15625 y^6 - 18750 x y^6 + 9375 x^2 y^6 - 2500 x^3 y^6 +
>    375 x^4 y^6 - 30 x^5 y^6 + x^6 y^6

In[11]:= Factor[%]

Out[11]= (2 + x + 25 x^2 - 10 x^3 + x^4 + 50 x y - 20 x^2 y + 2 x^3 y + 25 y^2 -
>    10 x y^2 + x^2 y^2) (4 + 4 x - 49 x^2 - 5 x^3 + 633 x^4 - 501 x^5 + 150 x^6 -
>    20 x^7 + x^8 - 100 x y - 10 x^2 y + 2516 x^3 y - 2002 x^4 y + 600 x^5 y -
>    80 x^6 y + 4 x^7 y - 50 y^2 - 5 x y^2 + 3758 x^2 y^2 - 3001 x^3 y^2 +
>    900 x^4 y^2 - 120 x^5 y^2 + 6 x^6 y^2 + 2500 x y^3 - 2000 x^2 y^3 + 600 x^3 y^3 -
>    80 x^4 y^3 + 4 x^5 y^3 + 625 y^4 - 500 x y^4 + 150 x^2 y^4 - 20 x^3 y^4 + x^4 y^4)




In[12]:= InputForm[%7]

Out[12]//InputForm= (12*x^2 - 12*x^2*Cos[2*x])/24

In[20]:= Plot[Sin[1/x], {x, 0.01, Pi}]

Out[20]= -Graphics-

In[42]:= Plot3D[x^2 + y^2, {x,-1,1}, {y,0,1}]

Out[42]= -SurfaceGraphics-

In[43]:= f[x_,y_] := Sin[(x^2 + y^3)] / (x^2 + y^2)

In[44]:= f[2,3]

Out[44]= Sin[31]/13

In[45]:= ContourPlot[x^2 + y^2, {x,-1,1}, {y,-1,1}]

Out[45]= -SurfaceGraphics-

In[46]:= Plot3D[f[x,y], {x,-Pi,Pi}, {y,-Pi,Pi}, PlotPoints -> 30,
         PlotLabel -> "Sin[(x^2 + y^3)] / (x^2 + y^2)", PlotRange -> {-1,1}]

Out[46]= -SurfaceGraphics-

[Figures: surface plot and contour plot, both titled Sin[(x^2 + y^3)] / (x^2 + y^2)]

In[47]:= ContourPlot[f[x,y], {x,-2,2}, {y,-2,2}, PlotPoints -> 30,
         ContourSmoothing -> True, ContourShading -> False,
         PlotLabel -> "Sin[(x^2 + y^3)] / (x^2 + y^2)"]

Out[47]= -ContourGraphics-
In[52]:= Table[x^2, {x, 1, 10}]

Out[52]= {1, 4, 9, 16, 25, 36, 49, 64, 81, 100}

In[53]:= Table[{n, n^2}, {n, 2, 20}]

Out[53]= {{2, 4}, {3, 9}, {4, 16}, {5, 25}, {6, 36}, {7, 49}, {8, 64},
>    {9, 81}, {10, 100}, {11, 121}, {12, 144}, {13, 169}, {14, 196},
>    {15, 225}, {16, 256}, {17, 289}, {18, 324}, {19, 361}, {20, 400}}

In[54]:= Transpose[%]

Out[54]= {{2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
>    20}, {4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225, 256,
>    289, 324, 361, 400}}

In[60]:= ListPlot[Table[Random[] + Sin[x/10], {x, 0, 100}]]

Out[60]= -Graphics-

In[61]:= x = Table[i, {i,1,6}]

Out[61]= {1, 2, 3, 4, 5, 6}

In[62]:= A = Table[i*j, {i,1,5}, {j,1,6}]

Out[62]= {{1, 2, 3, 4, 5, 6}, {2, 4, 6, 8, 10, 12}, {3, 6, 9, 12, 15, 18},
>    {4, 8, 12, 16, 20, 24}, {5, 10, 15, 20, 25, 30}}

In[63]:= A.x

Out[63]= {91, 182, 273, 364, 455}

In[64]:= x.x

Out[64]= 91

In[71]:= B = A.Transpose[A]

Out[71]= {{91, 182, 273, 364, 455}, {182, 364, 546, 728, 910},
>    {273, 546, 819, 1092, 1365}, {364, 728, 1092, 1456, 1820},
>    {455, 910, 1365, 1820, 2275}}

In[72]:= B - IdentityMatrix[5]

Out[72]= {{90, 182, 273, 364, 455}, {182, 363, 546, 728, 910},
>    {273, 546, 818, 1092, 1365}, {364, 728, 1092, 1455, 1820},
>    {455, 910, 1365, 1820, 2274}}



%                                         last command
%n                                        nth last command
?f                                        help for function f
??f                                       more help for f
f[x_,y_] := x^2 * Cos[y]                  define function f(x, y)
a = 5                                     assign a constant to variable a
f = x^2 * Cos[y]                          assign an expression to variable f
                                          (f is only a placeholder for the expression, not a function!)
D[f[x,y],x]                               derivative of f with respect to x
Integrate[f[x,y],y]                       antiderivative of f with respect to y
Simplify[expr]                            simplifies an expression
Expand[expr]                              expands an expression
Solve[f[x]==g[x]]                         solves an equation
^C                                        cancel
InputForm[Expr]                           converts into Mathematica input form
TeXForm[Expr]                             converts into the LaTeX form
FortranForm[Expr]                         converts into the Fortran form
CForm[Expr]                               converts into the C form
ReadList["daten.dat", {Number, Number}]   reads 2-column table from file
Table[f[n], {n, n_min, n_max}]            generates a list f(n_min), ..., f(n_max)
Plot[f[x], {x, x_min, x_max}]             generates a plot of f
ListPlot[Liste]                           plots a list
Plot3D[f[x,y], {x, x_min, x_max}, {y, y_min, y_max}]       generates a three-dim. plot of f
ContourPlot[f[x,y], {x, x_min, x_max}, {y, y_min, y_max}]  generates a contour plot of f
Display["Dateiname", %, "EPS"]            write to the file in PostScript format

Table 2.2: Mathematica - some important commands



Example 2.5 (Calculation of Square Roots) 

(*********** square root iterative **************)
sqrt[a_, accuracy_] := Module[{x, xn, delta, n},
  (* Heron iteration; stop when two iterates differ by at most 10^(-accuracy) *)
  For[{delta = 9999999; n = 1; x = a}, delta > 10^(-accuracy), n++,
    xn = x;
    x = 1/2 (x + a/x);
    delta = Abs[x - xn];
    Print["n = ", n, "  x = ", N[x, 2*accuracy], "  delta = ", N[delta]];
  ];
  N[x, accuracy]
]

sqrt::usage = "sqrt[a,n] computes the square root of a to n digits."

Table[sqrt[i, 10], {i, 1, 20}]

(*********** square root recursive **************)
x[n_,a_] := 1/2 (x[n-1,a] + a/x[n-1,a])
x[1,a_] := a



2.3 Gnuplot, a professional Plotting Software 



Gnuplot is a powerful plotting program with a command line interface and a batch interface.
Online documentation can be found on www.gnuplot.info.



On the command line we can input 
plot [0:10] sin(x) 
to obtain the graph 



Almost arbitrary customization of plots is possible via the batch interface. A simple batch
file may contain the lines

set terminal postscript eps color enhanced 26
set label "{/Symbol a}=0.01, {/Symbol g}=5" at 0.5,2.2
set output "bucket3.eps"
plot [b=0.01:1] a=0.01, c=5, (a-b-c)/(log(a) - log(b)) \
  title "({/Symbol a}-{/Symbol b}-{/Symbol g})/(ln{/Symbol a} - ln{/Symbol b})"

producing an EPS file with the graph



3-dimensional plotting is also possible, e.g. with the commands

set isosamples 50
splot [-pi:pi] [-pi:pi] sin((x**2 + y**3) / (x**2 + y**2))

which produces the graph 
2.4 Short Introduction to MATLAB

Effective systems: 

• MATLAB & SIMULINK (MathWorks) 

2.4.0.2 Some examples as jump start

Out(1)=3+2^3

ans = 11

Out(2)=sqrt(10)
ans = 3.1623

Out(3)=vpa(sqrt(10),60)

= 3.16227766016837933199889354443271853371955513932521682685750

syms x
syms y

y=x^2*sin(x)^2

   x^2 sin(x)^2

z=int(y,x)

   x^2 (- 1/2 cos(x) sin(x) + 1/2 x) - 1/2 x cos(x)^2 + 1/4 cos(x) sin(x) + 1/4 x - 1/3 x^3

Der=diff(z,x)

   2 x (- 1/2 cos(x) sin(x) + 1/2 x) + x^2 (1/2 sin(x)^2 - 1/2 cos(x)^2 + 1/2)
   - 1/4 cos(x)^2 + x cos(x) sin(x) - 1/4 sin(x)^2 + 1/4 - x^2

Simple=simplify(Der)

   x^2 sin(x)^2

Series=taylor(exp(x),6,x,0)

   1 + x + 1/2 x^2 + 1/6 x^3 + 1/24 x^4 + 1/120 x^5

Pol=(x+2)^2+((x-5)^2*(x+y)^2)^3

   (x + 2)^2 + (x - 5)^6 (x + y)^6

Exp_Pol=expand(Pol)

   4 + 4 x + x^2 + 15625 x^6 + 93750 x^5 y + 234375 x^4 y^2 + 312500 x^3 y^3
>  + 234375 x^2 y^4 + 93750 x y^5 + 6 x^11 y + 15 x^10 y^2 + 20 x^9 y^3
>  + 15 x^8 y^4 + 6 x^7 y^5 + x^6 y^6 - 180 x^10 y - 450 x^9 y^2 - 600 x^8 y^3
>  - 450 x^7 y^4 - 180 x^6 y^5 + 15625 y^6 + x^12 - 30 x^11 + 375 x^10 - 2500 x^9
>  + 9375 x^8 - 18750 x^7 - 30 x^5 y^6 + 2250 x^9 y + 5625 x^8 y^2 + 7500 x^7 y^3
>  + 5625 x^6 y^4 + 2250 x^5 y^5 + 375 x^4 y^6 - 15000 x^8 y - 37500 x^7 y^2
>  - 50000 x^6 y^3 - 37500 x^5 y^4 - 15000 x^4 y^5 - 2500 x^3 y^6 + 56250 x^7 y
>  + 140625 x^6 y^2 + 187500 x^5 y^3 + 140625 x^4 y^4 + 56250 x^3 y^5
>  + 9375 x^2 y^6 - 112500 x^6 y - 281250 x^5 y^2 - 375000 x^4 y^3
>  - 281250 x^3 y^4 - 112500 x^2 y^5 - 18750 x y^6

t=0:0.01:pi
plot(sin(1./t))

[Figure: Plot]

[X,Y]=meshgrid(-1:0.01:1,-1:0.01:1)

Z=sin(X.^2+Y.^3)./(X.^2+Y.^2)
surf(X,Y,Z)

[Figures: 3D Plot, Contour Plot]

x=1:1:10
y(1:10)=x.^2

y =

[ 1, 4, 9, 16, 25, 36, 49, 64, 81, 100]

A_1=[1 2 4; 5 6 100; -10.1 23 56]

A_1 =
    1.0000    2.0000    4.0000
    5.0000    6.0000  100.0000
  -10.1000   23.0000   56.0000

A_2=rand(3,4)

A_2 =
    0.2859    0.7157    0.4706    0.7490
    0.5437    0.8390    0.5607    0.5039
    0.9848    0.4333    0.2691    0.6468

A_2' =
    0.3077    0.1387    0.4756
    0.3625    0.7881    0.7803
    0.6685    0.1335    0.0216
    0.5598    0.3008    0.9394

A_1*A_2 =
    3.1780    5.9925    5.0491    3.0975
   43.5900   94.5714   92.6770   29.3559
   26.3095   57.1630   58.7436   17.5258

[L U]=lu(A_1)

L =
   -0.0990    0.2460    1.0000
   -0.4950    1.0000         0
    1.0000         0         0

U =
  -10.1000   23.0000   56.0000
         0   17.3861  127.7228
         0         0  -21.8770

[Q R]=qr(A_1)

Q =
   -0.0884   -0.2230    0.9708
   -0.4419   -0.8647   -0.2388
    0.8927   -0.4501   -0.0221

R =
  -11.3142   17.7035    5.4445
         0  -15.9871 -112.5668
         0         0  -21.2384

b=[1;2;3]
x=A_1\b

b =
     1
     2
     3

x =
    0.3842
    0.3481
   -0.0201

A_3=[1 2 3; -1 0 5; 8 9 23]

A_3 =
     1     2     3
    -1     0     5
     8     9    23

Inverse=inv(A_3)

Inverse =
   -0.8333   -0.3519    0.1852
    1.1667   -0.0185   -0.1481
   -0.1667    0.1296    0.0370

Example 2.6 (Calculation of Square Roots) 

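%*********** square root iterative **************
function [b] = calculate_Sqrt(a, accuracy)
clc;
x = a;                          % start value
delta = inf;
n = 1;                          % index of the next stored iterate
while delta >= 10^-(accuracy)
    Res(n) = x;                 % remember the current iterate
    xn = x;
    x = 0.5*(x + a/x);
    delta = abs(x - xn);
    n = n + 1;
end
b = Res;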



2.5 Short Introduction to GNU Octave 

From the Octave homepage: GNU Octave is a high-level interpreted language, primarily
intended for numerical computations. It provides capabilities for the numerical solution of
linear and nonlinear problems, and for performing other numerical experiments. It also
provides extensive graphics capabilities for data visualization and manipulation. Octave is
normally used through its interactive command line interface, but it can also be used to
write non-interactive programs. The Octave language is quite similar to Matlab so that
most programs are easily portable.

Downloads, Docs, FAQ, etc.:

http://www.gnu.org/software/octave/

Nice Introduction/Overview:

http://math.jacobs-university.de/Oliver/teaching/iub/resources/octave/octave-intro/octave-intro.html

Plotting in Octave:

http://www.gnu.org/software/octave/doc/interpreter/Plotting.html

// -> comments
BASICS 



octave:47> 1+1
ans = 2

octave:48> x = 2 * 3
x = 6

// suppress output

octave:49> x = 2 * 3;
octave:50>

// help

octave:53> help sin
'sin' is a built-in function
-- Mapping Function: sin (X)

Compute the sine for each element of X in radians.



VECTORS AND MATRICES 



// define 2x2 matrix

octave:1> A = [1 2; 3 4]
A =

   1   2
   3   4

// define 3x3 matrix

octave:3> A = [1 2 3; 4 5 6; 7 8 9]
A =

   1   2   3
   4   5   6
   7   8   9

// access single elements

octave:4> x = A(2,1)
x = 4

octave:17> A(3,3) = 17
A =

   1   2   3
   4   5   6
   7   8  17

// extract submatrices

octave:8> A
A =

   1   2   3
   4   5   6
   7   8  17

octave:9> B = A(1:2,2:3)
B =

   2   3
   5   6

octave:36> b=A(1:3,2)
b =

   2
   5
   8

// transpose

octave:25> A'
ans =

   1   4   7
   2   5   8
   3   6  17

// determinant

octave:26> det(A)
ans = -24.000

// solve Ax = b

// inverse

octave:22> inv(A)
ans =

  -1.54167   0.41667   0.12500
   1.08333   0.16667  -0.25000
   0.12500  -0.25000   0.12500

// define vector b

octave:27> b = [3 7 12]'
b =

    3
    7
   12

// solution x

octave:29> x = inv(A) * b
x =

  -0.20833
   1.41667
   0.12500

octave:30> A * x
ans =

    3.0000
    7.0000
   12.0000

// try A\b

// illegal operation
octave:31> x * b

error: operator *: nonconformant arguments (op1 is 3x1, op2 is 3x1)

// therefore allowed

octave:31> x' * b
ans = 10.792

octave:32> x * b'
ans =

  -0.62500  -1.45833  -2.50000
   4.25000   9.91667  17.00000
   0.37500   0.87500   1.50000

// elementwise operations

octave:11> a = [1 2 3]
a =

   1   2   3

octave:10> b = [4 5 6]
b =

   4   5   6

octave:12> a*b

error: operator *: nonconformant arguments (op1 is 1x3, op2 is 1x3)
octave:12> a.*b
ans =

    4   10   18

octave:23> A = [1 2; 3 4]
A =

   1   2
   3   4

octave:24> A^2
ans =

    7   10
   15   22

octave:25> A.^2
ans =

    1    4
    9   16

// create special vectors/matrices 

octave:52> x = [0:1:5]
x =

   0   1   2   3   4   5

octave:53> A = zeros(2)
A =

   0   0
   0   0

octave:54> A = zeros(2,3)
A =

   0   0   0
   0   0   0

octave:55> A = ones(2,3)
A =

   1   1   1
   1   1   1

octave:56> A = eye(4)
A =

Diagonal Matrix

   1   0   0   0
   0   1   0   0
   0   0   1   0
   0   0   0   1

octave:57> B = A * 5
B =

Diagonal Matrix

   5   0   0   0
   0   5   0   0
   0   0   5   0
   0   0   0   5

// vector/matrix size

octave:43> size(A)
ans =

   3   3

octave:44> size(b)
ans =

   3   1

octave:45> size(b)(1)
ans = 3



PLOTTING (2D) 



octave:35> x = [-2*pi:0.1:2*pi];
octave:36> y = sin(x);
octave:37> plot(x,y)
octave:38> z = cos(x);
octave:39> plot(x,z)

// two curves in one plot

octave:40> plot(x,y)
octave:41> hold on
octave:42> plot(x,z)

// reset plots
octave:50> close all

// plot different styles

octave:76> plot(x,z,'r')
octave:77> plot(x,z,'rx')
octave:78> plot(x,z,'go')

octave:89> close all

// manipulate plot

octave:90> hold on
octave:91> x = [-pi:0.01:pi];

// another linewidth

octave:92> plot(x,sin(x),'linewidth',2)
octave:93> plot(x,cos(x),'r','linewidth',2)

// define axes range and aspect ratio

octave:94> axis([-pi,pi,-1,1],'equal')

-> try 'square' or 'normal' instead of 'equal' (help axis)

// legend

octave:95> legend('sin','cos')

// set parameters (gca = get current axis)

octave:99> set(gca,'keypos',2)      // legend position (1-4)
octave:103> set(gca,'xgrid','on')   // show grid in x
octave:104> set(gca,'ygrid','on')   // show grid in y

// title/labels

octave:102> title('OCTAVE DEMO PLOT')
octave:100> xlabel('unit circle')
octave:101> ylabel('trigon. functions')

// store as png

octave:105> print -dpng 'demo_plot.png'

[Figure: OCTAVE DEMO PLOT, sin and cos over the x-axis labelled "unit circle"]



DEFINE FUNCTIONS

sigmoid.m:

function S = sigmoid(X)
  mn = size(X);
  S = zeros(mn);
  for i = 1:mn(1)
    for j = 1:mn(2)
      S(i,j) = 1 / (1 + e^(-X(i,j)));
    end
  end
end

easier:

function S = sigmoid(X)
  S = 1 ./ (1 + e.^(-X));
end

octave:1> sig + [TAB]
sigmoid sigmoid.m
octave:1> sigmoid(10)
ans = 0.99995

octave:2> sigmoid([1 10])
error: for x^A, A must be square // (if not yet implemented elementwise)
error: called from:
error: /home/richard/faculty/adv_math/octave/sigmoid.m at line 3, column 4

octave:2> sigmoid([1 10])
ans =

   0.73106   0.99995

octave:3> x = [-10:0.01:10];
octave:5> plot(x,sigmoid(x),'linewidth',3);
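
For comparison, the same elementwise idea can be sketched in plain Python. The sketch is an assumption-laden illustration, not part of the Octave session: nested lists stand in for matrices, and the branch for negative arguments is an addition of ours that avoids overflow in exp for large negative inputs.

```python
import math


def sigmoid(x):
    """Logistic function 1/(1 + e^(-x)), applied elementwise to nested lists."""
    if isinstance(x, (list, tuple)):
        return [sigmoid(v) for v in x]
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # for x < 0 use the equivalent form e^x / (1 + e^x) to avoid overflow
    ex = math.exp(x)
    return ex / (1.0 + ex)


if __name__ == "__main__":
    print(sigmoid(10))
    print(sigmoid([1, 10]))
    print(sigmoid([[0, -1], [1, 2]]))
```

As in the Octave version, the scalar and the "matrix" case are handled by the same function; only the elementwise application has to be spelled out explicitly.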



PLOTTING (3D) 



// meshgrid

octave:54> [X,Y] = meshgrid([1:3],[1:3])
X =

   1   2   3
   1   2   3
   1   2   3

Y =

   1   1   1
   2   2   2
   3   3   3

// meshgrid with higher resolution (suppress output)
octave:15> [X,Y] = meshgrid([-4:0.2:4],[-4:0.2:4]);

// function over x and y, remember that cos and sin
// operate on each element, result is matrix again

octave:20> Z = cos(X) + sin(1.5*Y); 

// plot 

octave:21> mesh(X,Y,Z) 
octave:22> surf(X,Y,Z) 




octave:44> contour(X,Y,Z)
octave:45> colorbar
octave:46> pcolor(X,Y,Z)

[Figure: contour and pcolor plots of Z with colorbar, both axes from -4 to 4]




RANDOM NUMBERS / HISTOGRAMS 



// equally distributed random numbers

octave:4> x=rand(1,5)
x =

   0.71696   0.95553   0.17808   0.82110   0.25843

octave:5> x=rand(1,1000);
octave:6> hist(x);

// normally distributed random numbers

octave:5> x=randn(1,1000);
octave:6> hist(x);

// try

octave:5> x=randn(1,10000);
octave:6> hist(x, 25);




2.6 Exercises 
Mathematica 



Exercise 2.1 Program the factorial function with Mathematica.

a) Write an iterative program that calculates the formula $n! = n \cdot (n-1) \cdot \ldots \cdot 1$.

b) Write a recursive program that calculates the formula

\[ n! = \begin{cases} n \cdot (n-1)! & \text{if } n > 1 \\ 1 & \text{if } n = 1 \end{cases} \]

analogously to the root example in the script.


Exercise 2.2

a) Write a Mathematica program that multiplies two arbitrary matrices. Don't forget to
check the dimensions of the two matrices before multiplying. The formula is

\[ C_{ij} = \sum_{k=1}^{n} A_{ik} B_{kj}. \]

Try to use the functions Table, Sum and Length only.

b) Write a Mathematica program that computes the transpose of a matrix using the Table
function.

c) Write a Mathematica program that computes the inverse of a matrix using the function
LinearSolve.



MATLAB 

Exercise 2.3

a) For a finite geometric series we have the formula $\sum_{i=0}^{n} q^i = \frac{1-q^{n+1}}{1-q}$. Write a MATLAB
function that takes q and n as inputs and returns the sum.

b) For an infinite geometric series we have the formula $\sum_{i=0}^{\infty} q^i = \frac{1}{1-q}$ if the series converges.
Write a MATLAB function that takes q as input and returns the sum. Your function
should produce an error if the series diverges.

Exercise 2.4

a) Create a 5 x 10 random Matrix A. 

b) Compute the mean of each column and assign the results to elements of a vector called 
avg. 

c) Compute the standard deviation of each column and assign the results to the elements 
of a vector called s. 



Exercise 2.5 Given the row vectors x = [4, 1, 6, 10, -4, 12, 0.1] and y = [-1, 4, 3, 10, -9, 15,
compute the following arrays,

a) ci{j 

b) bn = * 

’ l] Vj 

c) Cj = XiUi, then add up the elements of c using two different programming approaches. 

<n d = 

e) Arrange the elements of x and y in ascending order and calculate being the reciprocal 
of the less Xi and y 3 . 

f) Reverse the order of elements in x and y in one command. 



Exercise 2.6 Write a MATLAB function that calculates recursively the square root of a 
number. 



Analysis Repetition 

Exercise 2.7 In a bucket with capacity $v$ there is a poisonous liquid with volume $\alpha v$. The
bucket has to be cleaned by repeatedly diluting the liquid with a fixed amount $(\beta - \alpha)v$
($0 < \beta < 1 - \alpha$) of water and then emptying the bucket. After emptying, the bucket
always keeps $\alpha v$ of its liquid. Cleaning stops when the concentration $c_n$ of the poison after
$n$ iterations is reduced from 1 to $c_n < \varepsilon$, $\varepsilon > 0$.

a) Assume $\alpha = 0.01$, $\beta = 1$ and $\varepsilon = 10^{-9}$. Compute the number of cleaning iterations.

b) Compute the total volume of water required for cleaning.

c) Can the total volume be reduced by reducing $\beta$? If so, determine the optimal $\beta$.

d) Give a formula for the time required for cleaning the bucket.

e) How can the time for cleaning the bucket be minimized?




Chapter 3 

Calculus — Selected Topics 



3.1 Sequences and Convergence 

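Definition 3.1 A function $\mathbb{N} \to \mathbb{R}$, $n \mapsto a_n$ is called a sequence.
Notation: $(a_n)_{n \in \mathbb{N}}$ or $(a_1, a_2, a_3, \ldots)$

Example 3.1

$(1, 2, 3, 4, \ldots) = (n)_{n \in \mathbb{N}}$
$(1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \ldots) = (\frac{1}{n})_{n \in \mathbb{N}}$
$(1, 2, 4, 8, 16, \ldots) = (2^{n-1})_{n \in \mathbb{N}}$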

Consider the following sequences: 

1. 1,2,3,5,7,11,13,17,19,23,... 

2. 1,3,6,10,15,21,28,36,45,55,66,.. 

3. 1,1,2,3,5,8,13,21,34,55,89,... 

4. 8,9, 1,-8, -10,-3, 6, 9, 4, -6, -10 



5. 1,2,3,4,6,7,9,10,11,13,14,15,16,17,18,19,21,22,23,24,26,27,29,30,31,32,33,34,35,36, 37,.. 



6. 1,3,5,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,31,33, 35,37,38,39,41,43,. 



Find the next 5 elements of each sequence. If you get stuck or want to solve more
riddles, have a look at http://www.oeis.org.

Definition 3.2 $(a_n)_{n \in \mathbb{N}}$ is called bounded, if there are $A, B \in \mathbb{R}$ with $\forall n \; A \le a_n \le B$.
$(a_n)_{n \in \mathbb{N}}$ is called monotonically increasing (decreasing), iff $\forall n \; a_{n+1} \ge a_n$ ($a_{n+1} \le a_n$).

Definition 3.3 A sequence of real numbers $(a_n)_{n \in \mathbb{N}}$ converges to $a \in \mathbb{R}$, iff:

\[ \forall \varepsilon > 0 \;\; \exists N(\varepsilon) \in \mathbb{N}, \text{ so that } |a_n - a| < \varepsilon \;\; \forall n \ge N(\varepsilon) \]

Notation: $\lim_{n \to \infty} a_n = a$

[Figure: terms of a convergent sequence over n, with the index N(ε) marked]

Definition 3.4 A sequence is called divergent if it is not convergent.

Example 3.2

1.) $(1, \frac{1}{2}, \frac{1}{3}, \ldots)$ converges to 0 (zero sequence)

2.) $(1, 1, 1, \ldots)$ converges to 1

3.) $(1, -1, 1, -1, \ldots)$ is divergent

4.) $(1, 2, 3, \ldots)$ is divergent

Theorem 3.1 Every convergent sequence is bounded.

Proof: Choose $\varepsilon = 1$. The first $N(1)$ terms are finitely many and therefore bounded; all
remaining terms lie between $a - 1$ and $a + 1$.
Note: Not every bounded sequence converges (see exercise 3), but:

Theorem 3.2 Every bounded monotonic sequence is convergent.

3.1.1 Sequences and Limits

Let $(a_n)$, $(b_n)$ be two convergent sequences with $\lim a_n = a$, $\lim b_n = b$. Then it holds:

\begin{align*}
\lim_{n \to \infty} (a_n \pm b_n) &= \lim_{n \to \infty} a_n \pm \lim_{n \to \infty} b_n = a \pm b \\
\lim_{n \to \infty} (c \cdot a_n) &= c \cdot \lim_{n \to \infty} a_n = c \cdot a \\
\lim_{n \to \infty} (a_n \cdot b_n) &= a \cdot b \\
\lim_{n \to \infty} \frac{a_n}{b_n} &= \frac{a}{b} \quad \text{if } b_n, b \ne 0
\end{align*}



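Example 3.3 Show that the sequence $a_n = \left(1 + \frac{1}{n}\right)^n$, $n \in \mathbb{N}$ converges:

n      1    2     3     4     10    100    1000   10000
a_n    2    2.25  2.37  2.44  2.59  2.705  2.717  2.7181

The numbers (only) suggest that the sequence converges.

1. Boundedness: $\forall n \; a_n > 0$ and

\begin{align*}
a_n &= \left(1 + \frac{1}{n}\right)^n \\
&= 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{2} \cdot \frac{1}{n^2} + \frac{n(n-1)(n-2)}{2 \cdot 3} \cdot \frac{1}{n^3} + \ldots + \frac{1}{n^n} \\
&= 1 + 1 + \frac{1}{2}\left(1 - \frac{1}{n}\right) + \frac{1}{2 \cdot 3}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) + \ldots + \frac{1}{n!}\left(1 - \frac{1}{n}\right)\left(1 - \frac{2}{n}\right) \cdots \left(1 - \frac{n-1}{n}\right) \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 3} + \ldots + \frac{1}{n!} \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{2^{n-1}} \\
&< 1 + 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots \\
&= 1 + \frac{1}{1 - \frac{1}{2}} \\
&= 3
\end{align*}

2. Monotony: Replacing $n$ by $n + 1$ in (1.) gives $a_n < a_{n+1}$, since in line 3 most
summands in $a_{n+1}$ are bigger!

The limit of this sequence is the Euler number.

\[ e := \lim_{n \to \infty} \left(1 + \frac{1}{n}\right)^n = 2.718281828\ldots \]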
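
The table of Example 3.3 is easy to reproduce numerically. The following Python sketch is only an illustration (the function name a is ours); it prints the same values and makes the slow approach to e visible.

```python
import math


def a(n):
    """n-th term of the sequence (1 + 1/n)^n."""
    return (1.0 + 1.0 / n) ** n


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 10, 100, 1000, 10000):
        # third column: remaining distance to Euler's number
        print(n, round(a(n), 4), math.e - a(n))
```

The printed differences shrink roughly like e/(2n), so each additional correct digit costs about ten times as many factors; this is why the series representation of e treated later is preferred in practice.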



3.2 Series

Definition 3.5 Let $(a_n)_{n \in \mathbb{N}}$ be a sequence of real numbers. The sequence

\[ s_n := \sum_{k=0}^{n} a_k, \quad n \in \mathbb{N} \]

of the partial sums is called (infinite) series and is denoted by $\sum_{k=0}^{\infty} a_k$.

If $(s_n)_{n \in \mathbb{N}}$ converges, we define

\[ \sum_{k=0}^{\infty} a_k := \lim_{n \to \infty} \sum_{k=0}^{n} a_k. \]




Example 3.4

n                                 0  1  2  3   4   5   6   7   8   9  10 ...
Sequence a_n                      0  1  2  3   4   5   6   7   8   9  10 ...
Series S_n = \sum_{k=0}^{n} a_k   0  1  3  6  10  15  21  28  36  45  55 ...

n              0  1    2     3      4      5      6       7        8        9         10
Sequence a_n   1  1/2  1/4   1/8    1/16   1/32   1/64    1/128    1/256    1/512     1/1024
Series S_n     1  3/2  7/4   15/8   31/16  63/32  127/64  255/128  511/256  1023/512  2047/1024
(decimal)      1  1.5  1.75  1.875  1.938  1.969  1.984   1.992    1.996    1.998     1.999
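
Both tables of Example 3.4 can be recomputed with exact rational arithmetic. The sketch below is a Python illustration under our own naming (partial_sums is a hypothetical helper, not taken from the script).

```python
from fractions import Fraction
from itertools import accumulate


def partial_sums(terms):
    """Return the list s_0, s_1, ... of partial sums of the given terms."""
    return list(accumulate(terms))


# a_n = n: the partial sums are the triangular numbers
arith = partial_sums(range(11))

# a_n = 1/2^n: the partial sums approach 2
geom = partial_sums(Fraction(1, 2 ** k) for k in range(11))

if __name__ == "__main__":
    print(arith)
    for k, s in enumerate(geom):
        print(k, s, round(float(s), 3))
```

For the geometric example every partial sum satisfies $s_n = 2 - 2^{-n}$, which already shows that the series converges to 2, whereas the partial sums of the first example grow without bound.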



3.2.1 Convergence criteria for series

Theorem 3.3 (Cauchy) The series $\sum_{n=0}^{\infty} a_n$ converges iff

\[ \forall \varepsilon > 0 \;\; \exists N \in \mathbb{N}: \quad \left| \sum_{k=m}^{n} a_k \right| < \varepsilon \quad \text{for all } n \ge m \ge N \]

Proof: Let $s_p := \sum_{k=0}^{p} a_k$. Then $s_n - s_{m-1} = \sum_{k=m}^{n} a_k$. Therefore $(s_n)_{n \in \mathbb{N}}$ is a Cauchy sequence
$\Leftrightarrow$ $(s_n)$ is convergent.

Theorem 3.4 A series with $a_k \ge 0$ for $k \ge 1$ converges iff the sequence of partial sums
is bounded.

Proof: as exercise

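Theorem 3.5 (Comparison test)

Let $\sum_{n=0}^{\infty} c_n$ be a convergent series with $\forall n \; c_n \ge 0$ and $(a_n)_{n \in \mathbb{N}}$ a sequence with
$|a_n| \le c_n \; \forall n \in \mathbb{N}$. Then $\sum_{n=0}^{\infty} a_n$ converges.

Theorem 3.6 (Ratio test)

Let $\sum_{n=0}^{\infty} a_n$ be a series with $a_n \ne 0$ for all $n \ge n_0$. If a real number $q$ with $0 < q < 1$ exists, such that

\[ \left| \frac{a_{n+1}}{a_n} \right| \le q \quad \text{for all } n \ge n_0, \]

then the series $\sum_{n=0}^{\infty} a_n$ converges. If, from an index $n_0$, $\left| \frac{a_{n+1}}{a_n} \right| \ge 1$, then the series is divergent.

Proof idea (for the first part): Show that $\sum_{n=0}^{\infty} |a_0| q^n$ is a majorant.

Example 3.5 $\sum_{n=0}^{\infty} \frac{n^2}{2^n}$ converges.

Proof:

\[ \left| \frac{a_{n+1}}{a_n} \right| = \frac{(n+1)^2 \, 2^n}{2^{n+1} \, n^2} = \frac{1}{2} \left(1 + \frac{1}{n}\right)^2 \le \frac{1}{2} \left(1 + \frac{1}{3}\right)^2 = \frac{8}{9} < 1 \quad \text{for } n \ge 3 \]

3.2.2 Power series

Theorem 3.7 and definition For each $x \in \mathbb{R}$ the power series

\[ \exp(x) := \sum_{n=0}^{\infty} \frac{x^n}{n!} \]

is convergent.

Proof: The ratio test gives

\[ \left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{x^{n+1} \, n!}{(n+1)! \, x^n} \right| = \frac{|x|}{n+1} \le \frac{1}{2} \quad \text{for } n \ge 2|x| - 1 \]

Definition 3.6 Euler's number $e := \exp(1) = \sum_{n=0}^{\infty} \frac{1}{n!}$.

The function $\exp : \mathbb{R} \to \mathbb{R}^+$, $x \mapsto \exp(x)$ is called exponential function.
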
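Theorem 3.8 (Remainder)

\[ \exp(x) = \sum_{n=0}^{N} \frac{x^n}{n!} + R_N(x) \qquad (N\text{-th approximation}) \]

with

\[ |R_N(x)| \le 2 \, \frac{|x|^{N+1}}{(N+1)!} \quad \text{for } |x| \le 1 + \frac{N}{2} \text{ or } N \ge 2(|x| - 1) \]

3.2.2.1 Practical computation of exp(x):

\begin{align*}
\sum_{n=0}^{N} \frac{x^n}{n!} &= 1 + x + \frac{x^2}{2} + \ldots + \frac{x^{N-1}}{(N-1)!} + \frac{x^N}{N!} \\
&= 1 + x\left(1 + \frac{x}{2}\left(1 + \ldots + \frac{x}{N-2}\left(1 + \frac{x}{N-1}\left(1 + \frac{x}{N}\right)\right) \ldots \right)\right)
\end{align*}

\[ e = 1 + 1 + \frac{1}{2}\left( \ldots + \frac{1}{N-1}\left(1 + \frac{1}{N}\right) \ldots \right) + R_N \quad \text{with } R_N \le \frac{2}{(N+1)!} \]

For $N = 15$: $|R_{15}| \le \frac{2}{16!} < 10^{-13}$

$e = 2.718281828459 \pm 2 \cdot 10^{-12}$ (rounding error 5 times $10^{-13}$!)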



Theorem 3.9 The functional equation of the exponential function
$\forall x, y \in \mathbb{R}$ it holds: $\exp(x + y) = \exp(x) \cdot \exp(y)$.

Proof: The proof of this theorem is via the series representation (definition 3.6). It is not
easy, because it requires another theorem about the product of series (not covered here).

Conclusions:

a) $\forall x \in \mathbb{R} \quad \exp(-x) = (\exp(x))^{-1} = \frac{1}{\exp(x)}$

b) $\forall x \in \mathbb{R} \quad \exp(x) > 0$

c) $\forall n \in \mathbb{Z} \quad \exp(n) = e^n$

Notation: Also for real numbers $x \in \mathbb{R}$: $e^x := \exp(x)$

Proof:

a) $\exp(x) \cdot \exp(-x) = \exp(x - x) = \exp(0) = 1 \;\Rightarrow\; \exp(-x) = \frac{1}{\exp(x)}$

b) 1. Case $x \ge 0$: $\exp(x) = 1 + x + \frac{x^2}{2} + \ldots \ge 1 > 0$

2. Case $x < 0$: $-x > 0 \;\Rightarrow\; \exp(-x) > 0 \;\Rightarrow\; \exp(x) = \frac{1}{\exp(-x)} > 0$.

c) Induction: $\exp(1) = e$, $\exp(n) = \exp(n - 1 + 1) = \exp(n - 1) \cdot e = e^{n-1} \cdot e = e^n$
Note: for large $x := n + h$, $n \in \mathbb{N}$: $\exp(x) = \exp(n + h) = e^n \cdot \exp(h)$

(for large x faster than series expansion)
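
The nested evaluation of the exponential series and the splitting $\exp(n + h) = e^n \cdot \exp(h)$ can be sketched as follows in Python. The function names exp_series and exp_split and the default N = 15 are illustrative choices of ours; the sketch assumes that h is taken as the fractional part of x, so that $0 \le h < 1$ and the remainder bound of Theorem 3.8 applies.

```python
import math


def exp_series(x, N=15):
    """Sum_{n=0}^{N} x^n/n!, evaluated from the innermost bracket outwards."""
    result = 1.0
    for n in range(N, 0, -1):
        result = 1.0 + x / n * result
    return result


def exp_split(x, N=15):
    """exp(x) = e^n * exp(h) with n = floor(x) and 0 <= h < 1."""
    n = math.floor(x)
    h = x - n
    e = exp_series(1.0, N)
    return e ** n * exp_series(h, N)


if __name__ == "__main__":
    print(exp_series(1.0), math.e)
    print(exp_split(10.5), math.exp(10.5))
    print(exp_series(10.5), "(plain series with N = 15 is far too short here)")
```

The last line of the demo shows why the splitting is useful: for x = 10.5 the plain partial sum with N = 15 violates the condition $N \ge 2(|x| - 1)$ and is visibly wrong, while the split version only ever evaluates the series on [0, 1].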



3.3 Continuity 

Functions are characterized among others in terms of "smoothness". The weakest form of
smoothness is continuity.

Definition 3.7 Let $D \subset \mathbb{R}$, $f : D \to \mathbb{R}$ a function and $a \in \mathbb{R}$. We write

\[ \lim_{x \to a} f(x) = C, \]

if for each sequence $(x_n)_{n \in \mathbb{N}}$, $x_n \in D$ with $\lim_{n \to \infty} x_n = a$ holds:

\[ \lim_{n \to \infty} f(x_n) = C. \]




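Definition 3.8 For $x \in \mathbb{R}$ the expression $\lfloor x \rfloor$ denotes the unique integer number $n$ with
$n \le x < n + 1$.

Example 3.6

1. $\lim_{x \to 0} \exp(x) = 1$

2. $\lim_{x \to 1} \lfloor x \rfloor$ does not exist, since the left-side limit $\ne$ right-side limit.

3. Let $f : \mathbb{R} \to \mathbb{R}$ be a polynomial of the form $f(x) = x^k + a_1 x^{k-1} + \ldots + a_{k-1} x + a_k$, $k \ge 1$.
Then it holds: $\lim_{x \to \infty} f(x) = \infty$ and

\[ \lim_{x \to -\infty} f(x) = \begin{cases} \infty, & \text{if } k \text{ even} \\ -\infty, & \text{if } k \text{ odd} \end{cases} \]

Proof: for $x \ne 0$

\[ f(x) = x^k \left(1 + \frac{a_1}{x} + \frac{a_2}{x^2} + \ldots + \frac{a_k}{x^k}\right) =: x^k (1 + g(x)) \]

Since $\lim_{x \to \infty} g(x) = 0$, it follows $\lim_{x \to \infty} f(x) = \lim_{x \to \infty} x^k = \infty$.

Application: The asymptotic behavior for $x \to \infty$ of polynomials is always determined
by the highest power in x.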



Definition 3.9 (Continuity)

Let $f : D \to \mathbb{R}$ be a function and $a \in D$. The function $f$ is called continuous at point $a$, if

\[ \lim_{x \to a} f(x) = f(a). \]

$f$ is called continuous in $D$, if $f$ is continuous at every point of $D$.

[Figure: graph of a function with a jump at the point a]

For the depicted function it holds $\lim_{x \to a} f(x) \ne f(a)$. $f$ is discontinuous at the
point $a$.



Example 3.7

1.) $f : x \mapsto c$ (constant function) is continuous on the whole of $\mathbb{R}$.

2.) The exponential function is continuous on the whole of $\mathbb{R}$.

3.) The identity function $f : x \mapsto x$ is continuous on the whole of $\mathbb{R}$.

Theorem 3.10 Let $f, g : D \to \mathbb{R}$ be functions that are continuous at $a \in D$ and let $r \in \mathbb{R}$.
Then the functions $f + g$, $rf$, $f \cdot g$ are continuous at point $a$, too. If $g(a) \ne 0$, then $\frac{f}{g}$ is
continuous at $a$.

Proof: Let $(x_n)$ be a sequence with $x_n \in D$ and $\lim_{n \to \infty} x_n = a$.

To show:

\begin{align*}
\lim_{n \to \infty} (f + g)(x_n) &= (f + g)(a) \\
\lim_{n \to \infty} (rf)(x_n) &= (rf)(a) \\
\lim_{n \to \infty} (f \cdot g)(x_n) &= (f \cdot g)(a) \\
\lim_{n \to \infty} \left(\frac{f}{g}\right)(x_n) &= \left(\frac{f}{g}\right)(a)
\end{align*}

All four statements hold because of the rules for sequences.



Definition 3.10 Let $A, B, C$ be subsets of $\mathbb{R}$ with the functions $f : A \to B$ and $g : B \to C$.
Then $g \circ f : A \to C$, $x \mapsto g(f(x))$ is called the composition of $f$ and $g$.

Example 3.8

1.) $f \circ g(x) = f(g(x))$

2.) $\sqrt{\;} \circ \sin(x) = \sqrt{\sin(x)}$

3.) $\sin \circ \sqrt{\;}(x) = \sin(\sqrt{x})$

Theorem 3.11 Let $f : A \to B$ be continuous at $a \in A$ and $g : B \to C$ continuous at $y = f(a)$.
Then the composition $g \circ f$ is continuous in $a$, too.

Proof: to show: $\lim_{n \to \infty} x_n = a \;\Rightarrow\; \lim_{n \to \infty} g(f(x_n)) = g(f(a))$.

\[ \lim_{n \to \infty} x_n = a \;\overset{\text{continuity of } f}{\Longrightarrow}\; \lim_{n \to \infty} f(x_n) = f(a) \;\overset{\text{continuity of } g}{\Longrightarrow}\; \lim_{n \to \infty} g(f(x_n)) = g(f(a)). \]

Example 3.9 For $a > 0$, $\frac{x}{x^2 + a}$ is continuous on the whole of $\mathbb{R}$, because $f(x) = x^2$, $g(x) = f(x) + a$ and
$h(x) = \frac{x}{g(x)}$ are continuous.



Theorem 3.12 ($\varepsilon$-$\delta$ Definition of Continuity)

A function $f : D \to \mathbb{R}$ is continuous at $x_0 \in D$ iff:

\[ \forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \in D \;\; (|x - x_0| < \delta \Rightarrow |f(x) - f(x_0)| < \varepsilon) \]

Theorem 3.13 Let $f : [a, b] \to \mathbb{R}$ be continuous and strictly increasing (or decreasing) and
$A := f(a)$, $B := f(b)$. Then the inverse function $f^{-1} : [A, B] \to \mathbb{R}$ (resp. $[B, A] \to \mathbb{R}$) is
continuous and strictly increasing (or decreasing), too.

Example 3.10 (Roots)

Let $k \in \mathbb{N}$, $k \ge 2$. The function $f : \mathbb{R}^+ \to \mathbb{R}^+$, $x \mapsto x^k$ is continuous and strictly increasing.

The inverse function $f^{-1} : \mathbb{R}^+ \to \mathbb{R}^+$, $x \mapsto \sqrt[k]{x}$ is continuous and strictly increasing.

Theorem 3.14 (Intermediate Value)

Let $f : [a, b] \to \mathbb{R}$ be continuous with $f(a) < 0$ and $f(b) > 0$. Then there exists a $p \in [a, b]$ with
$f(p) = 0$.





Note: if $f(a) > 0$, $f(b) < 0$ take $-f$ instead of $f$ and apply the intermediate value theorem.

Example 3.11 $D = \mathbb{Q}$: $x \mapsto x^2 - 2 = f(x)$, $f(1) = -1$, $f(2) = 2$, but there is no $p \in D$ with
$f(p) = 0$, because the root $\sqrt{2}$ is not rational. The theorem therefore needs the real numbers.

Corollary 3.3.1 If $f : [a, b] \to \mathbb{R}$ is continuous and $y$ is any number between $f(a)$ and $f(b)$,
then there is at least one $x \in [a, b]$ with $f(x) = y$.

Note: Now it is clear that every continuous function on $[a, b]$ assumes every value in the
interval $[f(a), f(b)]$.
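
The intermediate value theorem is the basis of the bisection method for finding zeros. The following Python sketch is a minimal illustration under our own naming (bisect_root and its tolerance parameter are assumptions, not part of the script); it assumes $f(a) < 0 < f(b)$ exactly as in Theorem 3.14.

```python
def bisect_root(f, a, b, tol=1e-12):
    """Find p in [a, b] with f(p) = 0, assuming f continuous and f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 < f(b)")
    while b - a > tol:
        m = 0.5 * (a + b)
        if f(m) < 0:
            a = m      # a zero lies in [m, b]
        else:
            b = m      # a zero lies in [a, m]
    return 0.5 * (a + b)


if __name__ == "__main__":
    print(bisect_root(lambda x: x * x - 2, 1.0, 2.0))
```

Each step halves the interval while keeping a sign change inside it, so for $f(x) = x^2 - 2$ on $[1, 2]$ the midpoints approach $\sqrt{2}$.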

3.3.1 Discontinuity 



Definition 3.11 We write $\lim_{x \searrow a} f(x) = c$ ($\lim_{x \nearrow a} f(x) = c$), if for every sequence $(x_n)$ with
$x_n > a$ ($x_n < a$) and $\lim_{n \to \infty} x_n = a$ holds: $\lim_{n \to \infty} f(x_n) = c$.

$\lim_{x \searrow a} f(x)$ ($\lim_{x \nearrow a} f(x)$) is called right-side (left-side) limit of $f$ at $x = a$.

Theorem 3.15 A function is continuous at point $a$, if the right-side and left-side limit are
equal and coincide with $f(a)$.

Lemma 3.1 A function is discontinuous at the point $a$, if the limit $\lim_{x \to a} f(x)$ does not exist.

Conclusion: A function is discontinuous at the point $a$, if there are two sequences $(x_n)$, $(z_n)$
with $\lim x_n = \lim z_n = a$ and $\lim f(x_n) \ne \lim f(z_n)$.

Example 3.12

1. Step: $\lim_{x \nearrow a} f(x) = c_1 \ne c_2 = \lim_{x \searrow a} f(x)$

$f(x) = x - n$ for $n - \frac{1}{2} \le x < n + \frac{1}{2}$, $n \in \mathbb{Z}$

2. Pole: $\lim_{x \to x_0} f(x) = \infty$ or $\lim_{x \to x_0} f(x) = -\infty$

Example: $f(x) = \frac{1}{x^2}$

3. Oscillation:

The function $f(x) = \sin \frac{1}{x}$, $x \ne 0$ is discontinuous at $x = 0$.

Proof: let $x_n = \frac{1}{\frac{\pi}{2} + 2\pi n}$, $n \in \mathbb{N}$

$\Rightarrow \lim_{n \to \infty} x_n = 0$, $\lim_{n \to \infty} \sin \frac{1}{x_n} = 1$

but: let $z_n = \frac{1}{n \pi}$, $n \in \mathbb{N}$

$\Rightarrow \lim_{n \to \infty} z_n = 0$, $\lim_{n \to \infty} \sin \frac{1}{z_n} = 0$

The limit is not unique, therefore $\sin \frac{1}{x}$ is discontinuous.

Note: If a function $f$ is continuous $\forall x \in [a, b]$, then it holds for any convergent sequence $(x_n)$

\[ \lim_{n \to \infty} f(x_n) = f(\lim_{n \to \infty} x_n). \]

Proof: as exercise

Conclusion: Continuity of $f$ at $x_0 = \lim_{n \to \infty} x_n$ means that $f$ and $\lim_{n \to \infty}$ can be exchanged.



3.4 Taylor-Series

The Taylor series is a representation of a function as an infinite sum of powers of x.

Goals:

1. Simple representation of functions as polynomials, i.e.:

\[ f(x) \approx a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots + a_n x^n \]

2. Approximation of functions in the neighborhood of a point $x_0$.

Ansatz:

\[ P(x) = a_0 + a_1 (x - x_0) + a_2 (x - x_0)^2 + a_3 (x - x_0)^3 + \cdots + a_n (x - x_0)^n \]

coefficients $a_0, \ldots, a_n$ are sought such that

\[ f(x) = P(x) + R_n(x) \]

with a remainder term $R_n(x)$ and $\lim_{n \to \infty} R_n(x) = 0$, ideally for all x.

We require for some point $x_0$ that

\[ f(x_0) = P(x_0), \quad f'(x_0) = P'(x_0), \quad \ldots, \quad f^{(n)}(x_0) = P^{(n)}(x_0) \]

Computation of Coefficients:

\[ P(x_0) = a_0, \quad P'(x_0) = a_1, \quad P''(x_0) = 2 a_2, \quad \ldots, \quad P^{(k)}(x_0) = k! \, a_k, \quad \ldots \]

\[ \Rightarrow f^{(k)}(x_0) = k! \, a_k \;\Rightarrow\; a_k = \frac{f^{(k)}(x_0)}{k!} \]

Result:

\[ f(x) = \underbrace{f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n}_{P(x)} + R_n(x) \]

Example 3.13 Expansion of $f(x) = e^x$ in the point $x_0 = 0$:

\[ f(x_0) = f(0) = 1, \quad f'(0) = 1, \quad f''(0) = 1, \quad \ldots, \quad f^{(n)}(0) = 1 \]

\[ e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!} + R_n(x) \]

[Figure: $e^x$ together with the Taylor polynomials $1$, $1 + x$, $1 + x + \frac{x^2}{2}$ and $1 + x + \frac{x^2}{2} + \frac{x^3}{6}$ for x from -2 to 2]

Theorem 3.16 (Taylor Formula) Let $I \subset \mathbb{R}$ be an interval and $f : I \to \mathbb{R}$ an (n+1)-times
continuously differentiable function. Then for $x \in I$, $x_0 \in I$ we have

\[ f(x) = f(x_0) + \frac{f'(x_0)}{1!}(x - x_0) + \frac{f''(x_0)}{2!}(x - x_0)^2 + \cdots + \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + R_n(x) \]

with

\[ R_n(x) = \frac{1}{n!} \int_{x_0}^{x} (x - t)^n f^{(n+1)}(t) \, dt \]



Theorem 3.17 (Lagrangian form of the remainder term) Let $f : I \to \mathbb{R}$ be (n+1)-times
continuously differentiable and $x_0, x \in I$. Then there is a $z$ between $x_0$ and $x$ such
that

\[ R_n(x) = \frac{f^{(n+1)}(z)}{(n+1)!} (x - x_0)^{n+1} \]

Example 3.14 $f(x) = e^x$. Theorems 3.16 and 3.17 yield

\[ e^x = \sum_{k=0}^{n} \frac{x^k}{k!} + \underbrace{\frac{e^z}{(n+1)!} x^{n+1}}_{= R_n(x)} \quad \text{for } |z| < |x| \]

Convergence:

\[ |R_n(x)| \le \frac{e^{|x|} |x|^{n+1}}{(n+1)!} =: b_n \]

\[ \left| \frac{b_{n+1}}{b_n} \right| = \frac{|x|}{n+2} \to 0 \quad \text{for } n \to \infty \]

The ratio test implies convergence of $\sum_{n=0}^{\infty} b_n$.

\[ \Rightarrow \lim_{n \to \infty} b_n = 0 \;\Rightarrow\; \lim_{n \to \infty} R_n(x) = 0 \quad \text{for all } x \in \mathbb{R} \]

Thus the Taylor series for $e^x$ converges to $f(x)$ for all $x \in \mathbb{R}$!

Example 3.15 Evaluation of the integral

\[ \int_0^1 \sqrt{1 + x^3} \, dx. \]

As the function $f(x) = \sqrt{1 + x^3}$ has no simple antiderivative (primitive function), it cannot
be symbolically integrated. We compute an approximation for the integral by integrating
the third order Taylor polynomial

\[ \sqrt{1 + x^3} = (1 + x^3)^{1/2} \approx 1 + \frac{x^3}{2} \]

and substituting this into the integral

\[ \int_0^1 \sqrt{1 + x^3} \, dx \approx \int_0^1 1 + \frac{x^3}{2} \, dx = \left[ x + \frac{x^4}{8} \right]_0^1 = \frac{9}{8} = 1.125 \]

The exact value of the integral is about 1.11145, i.e. our approximation error is about 1%.
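
The claim of Example 3.15 can be checked numerically. The Python sketch below is an illustration under assumptions: it uses a composite Simpson rule (our choice, any quadrature rule would do) as reference value and compares it with the Taylor approximation 9/8.

```python
import math


def simpson(f, a, b, n=1000):
    """Composite Simpson rule with an even number n of subintervals."""
    if n % 2:
        raise ValueError("n must be even")
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        # odd nodes get weight 4, even inner nodes weight 2
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


reference = simpson(lambda x: math.sqrt(1 + x ** 3), 0.0, 1.0)
taylor = 1 + 1 / 8          # integral of 1 + x^3/2 over [0, 1]
rel_error = abs(taylor - reference) / reference

if __name__ == "__main__":
    print(reference, taylor, rel_error)
```

The relative error printed at the end is slightly above one percent, in line with the statement above.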





Definition 3.12 The series

\[ T_f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(x_0)}{k!} (x - x_0)^k \]

is called Taylor series of $f$ with expansion point (point of approximation) $x_0$.

Note:

1. For $x = x_0$ every Taylor series converges.

2. But for $x \ne x_0$ not all Taylor series converge!

3. A Taylor series converges for exactly these $x \in I$ to $f(x)$ for which the remainder term
from theorem 3.16 (3.17) converges to zero.

4. Even if the Taylor series of $f$ converges, it does not necessarily converge to $f$. (see the
example in the exercises.)

Example 3.16 (Logarithm series) For $0 < x \le 2$:

\[ \ln(x) = (x - 1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} \pm \cdots \]

Proof:

\[ \ln'(x) = \frac{1}{x}, \quad \ln''(x) = -\frac{1}{x^2}, \quad \ln'''(x) = \frac{2}{x^3}, \quad \ln^{(4)}(x) = -\frac{6}{x^4}, \quad \ln^{(n)}(x) = (-1)^{n-1} \frac{(n-1)!}{x^n} \]

Induction:

\[ \ln^{(n+1)}(x) = \left( \ln^{(n)}(x) \right)' = \left( (-1)^{n-1} \frac{(n-1)!}{x^n} \right)' = (-1)^n \frac{n!}{x^{n+1}} \]

Expansion at $x_0 = 1$:

\[ T_{\ln,1}(x) = \sum_{k=0}^{\infty} \frac{\ln^{(k)}(1)}{k!} (x - 1)^k = (x - 1) - \frac{(x-1)^2}{2} + \frac{(x-1)^3}{3} - \frac{(x-1)^4}{4} \pm \ldots \]

This series converges only for $0 < x \le 2$ (without proof).
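
The behaviour of the logarithm series inside, at the border of and outside the interval $(0, 2]$ can be observed with a few partial sums. The Python sketch below is only an illustration; the function name ln_taylor is ours.

```python
import math


def ln_taylor(x, n_terms):
    """Partial sum of (x-1) - (x-1)^2/2 + (x-1)^3/3 -+ ... with n_terms terms."""
    return sum((-1) ** (k + 1) * (x - 1) ** k / k for k in range(1, n_terms + 1))


if __name__ == "__main__":
    print(ln_taylor(1.5, 50), math.log(1.5))    # fast convergence
    print(ln_taylor(2.0, 1000), math.log(2.0))  # very slow convergence at x = 2
    print(ln_taylor(3.0, 50), math.log(3.0))    # partial sums blow up for x > 2
```

Close to the expansion point a few dozen terms give full machine precision, at $x = 2$ even a thousand terms give only about three digits, and for $x = 3$ the partial sums grow without bound.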



Definition 3.13 If a Taylor series converges for all $x$ in an interval $I$, we call $I$ the
convergence area.

If $I = [x_0 - r, x_0 + r]$ or $I = (x_0 - r, x_0 + r)$, $r$ is the convergence radius of the Taylor
series.



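Example 3.17 Relativistic mass increase:

Einstein: total energy: $E = mc^2$, kinetic energy: $E_{kin} = (m - m_0)c^2$

\[ m(v) = \frac{m_0}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} \]

to be shown: for $v \ll c$ we have $E_{kin} \approx \frac{1}{2} m_0 v^2$

\[ E_{kin} = (m - m_0)c^2 = \left( \frac{1}{\sqrt{1 - \left(\frac{v}{c}\right)^2}} - 1 \right) m_0 c^2 \]

\begin{align*}
\frac{1}{\sqrt{1 - x}} = (1 - x)^{-\frac{1}{2}} &= 1 + \frac{1}{2} x + \frac{\left(-\frac{1}{2}\right)\left(-\frac{3}{2}\right)}{2!} x^2 + \ldots \\
&= 1 + \frac{1}{2} x + \frac{3}{8} x^2 + \ldots
\end{align*}

for $x \ll 1$:

\[ \frac{1}{\sqrt{1 - x}} \approx 1 + \frac{1}{2} x \]

\[ \Rightarrow E_{kin} \approx \left( 1 + \frac{1}{2} \frac{v^2}{c^2} - 1 \right) m_0 c^2 = \frac{1}{2} m_0 v^2 + \frac{3}{8} m_0 \frac{v^4}{c^2} + \ldots \]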
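
How good the classical approximation is can be made concrete with a few numbers. The following Python sketch is an illustration with assumed names (e_kin_rel, e_kin_classical); the speed of light is the defined SI value, the test mass of 1 kg is arbitrary.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def e_kin_rel(m0, v):
    """Relativistic kinetic energy (m - m0) c^2."""
    gamma = 1.0 / math.sqrt(1.0 - (v / C) ** 2)
    return (gamma - 1.0) * m0 * C ** 2


def e_kin_classical(m0, v):
    """First term of the Taylor expansion: m0 v^2 / 2."""
    return 0.5 * m0 * v ** 2


if __name__ == "__main__":
    for beta in (0.01, 0.1, 0.5, 0.9):
        v = beta * C
        print(beta, e_kin_rel(1.0, v) / e_kin_classical(1.0, v))
```

The printed ratio is approximately $1 + \frac{3}{4} \frac{v^2}{c^2}$ for small velocities, which follows from the second term of the expansion above, and it grows quickly once $v$ is no longer small compared to $c$.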



3.5 Differential Calculus in many Variables

\begin{align*}
f : \mathbb{R}^n &\to \mathbb{R} \\
(x_1, x_2, \ldots, x_n) &\mapsto y = f(x_1, x_2, \ldots, x_n)
\end{align*}

or

\[ x \mapsto y = f(x) \]

3.5.1 The Vector Space $\mathbb{R}^n$

In order to "compare" vectors, we use a norm:

Definition 3.14 Any mapping $\| \cdot \| : \mathbb{R}^n \to \mathbb{R}$, $x \mapsto \|x\|$ is called norm if and only if

1. $\|x\| = 0$ iff $x = 0$

2. $\|\lambda x\| = |\lambda| \, \|x\| \quad \forall \lambda \in \mathbb{R}, x \in \mathbb{R}^n$

3. $\|x + y\| \le \|x\| + \|y\| \quad \forall x, y \in \mathbb{R}^n$ (triangle inequality)

The particular norm we will use here is the

Definition 3.15 (Euclidean Norm)

The function $| \cdot | : \mathbb{R}^n \to \mathbb{R}^+ \cup \{0\}$, $x \mapsto \sqrt{x_1^2 + \cdots + x_n^2}$ is the Euclidean norm of the
vector $x$.

Lemma: The Euclidean norm is a norm.

Theorem 3.18 For $x \in \mathbb{R}^n$ we have $x^2 = x \cdot x = |x|^2$.
Proof as exercise.

Note: The scalar product in $\mathbb{R}^n$ induces the Euclidean norm.







3.5.2 Sequences and Series in $\mathbb{R}^n$

analogous to Sequences and Series in $\mathbb{R}$!

Definition 3.16 A mapping $\mathbb{N} \to \mathbb{R}^n$, $n \mapsto a_n$ is called sequence.

Notation: $(a_n)_{n \in \mathbb{N}}$

Example 3.18

Definition 3.17 A sequence $(a_n)_{n \in \mathbb{N}}$ of vectors $a_n \in \mathbb{R}^n$ converges to $a \in \mathbb{R}^n$, if

\[ \forall \varepsilon > 0 \;\; \exists N(\varepsilon) \in \mathbb{N} \quad |a_n - a| < \varepsilon \quad \forall n \ge N(\varepsilon) \]

Notation: $\lim_{n \to \infty} a_n = a$

Theorem 3.19 A (vector) sequence $(a_n)_{n \in \mathbb{N}}$ converges to $a$ if and only if all its coordinate
sequences converge to the respective coordinates of $a$. (Proof as exercise.)

Notation:

\[ a_k = \begin{pmatrix} a_1^k \\ \vdots \\ a_n^k \end{pmatrix}, \qquad (a_k)_{k \in \mathbb{N}}, \quad a_k \in \mathbb{R}^n \]

Note: Theorem 3.19 enables us to lift most properties of sequences of real numbers to
sequences of vectors.



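3.5.3 Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$

$m = 1$: Functions $f$ from $D \subset \mathbb{R}^n$ to $B \subset \mathbb{R}$ have the form

\[ f : D \to B, \quad x \mapsto f(x), \qquad \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \mapsto f(x_1, \ldots, x_n) \]

Example 3.19

\[ f(x_1, x_2) = \sin(x_1 + \ln x_2) \]

$m \ne 1$: Functions $f$ from $D \subset \mathbb{R}^n$ to $B \subset \mathbb{R}^m$ have the form

\[ f : D \to B, \quad x \mapsto f(x), \qquad \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \mapsto \begin{pmatrix} f_1(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{pmatrix} \]

Example 3.20

1.

\[ f : \mathbb{R}^3 \to \mathbb{R}^2, \qquad \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} \mapsto \begin{pmatrix} \sqrt{x_1 x_2 x_3} \\ \cos x_1 + \sin x_2 \end{pmatrix} \]

2. Weather parameters: temperature, air pressure and humidity at any point on the earth

\[ f : [0°, 360°] \times [-90°, 90°] \to [-270, \infty] \times [0, \infty] \times [0, 100\%] \]

\[ \begin{pmatrix} \Theta \\ \Phi \end{pmatrix} \mapsto \begin{pmatrix} \text{temperature}(\Theta, \Phi) \\ \text{air pressure}(\Theta, \Phi) \\ \text{humidity}(\Theta, \Phi) \end{pmatrix} \]

Note: The components $f_1(x), \ldots, f_m(x)$ can be viewed (analysed) independently. Thus,
in the following we can restrict ourselves to $f : \mathbb{R}^n \to \mathbb{R}$.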



3.5.3.1 Contour Plots

Definition 3.18 Let $D \subset \mathbb{R}^2$, $B \subset \mathbb{R}$, $c \in B$, $f : D \to B$. The set $\{(x_1, x_2) \,|\, f(x_1, x_2) = c\}$
is called contour of $f$ to the niveau $c$.

Example 3.21 $f(x_1, x_2) = x_1 x_2$

\[ x_1 x_2 = c \]

for $x_1 \ne 0$: $x_2 = \frac{c}{x_1}$ (hyperbolas)

$c = 0 \Leftrightarrow x_1 = 0 \vee x_2 = 0$

ContourPlot[x y, {x,-3,3}, {y,-3,3},
  Contours -> {0,1,2,3,4,5,6,7,8,9,-1,
  -2,-3,-4,-5,-6,-7,-8,-9},
  PlotPoints -> 60]

Plot3D[x y, {x,-3,3}, {y,-3,3},
  PlotPoints -> 30]

3.5.4 Continuity in $\mathbb{R}^n$

analogous to continuity of functions in one variable:

Definition 3.19 Let $f : D \to \mathbb{R}^m$ be a function and $a \in \mathbb{R}^n$. If there is a sequence $(a_n)$
(maybe more than one sequence) with $\lim_{n \to \infty} a_n = a$, we write

\[ \lim_{x \to a} f(x) = c, \]

if for any sequence $(x_n)$, $x_n \in D$ with $\lim_{n \to \infty} x_n = a$:

\[ \lim_{n \to \infty} f(x_n) = c \]

Definition 3.20 (Continuity)

Let $f : D \to \mathbb{R}^m$ be a function and $a \in D$. The function $f$ is continuous in $a$, if $\lim_{x \to a} f(x) = f(a)$.
$f$ is continuous in $D$, if $f$ is continuous in all points in $D$.

Note: These definitions are analogous to the one-dimensional case.

Theorem 3.20 If $f : D \to \mathbb{R}^m$, $g : D \to \mathbb{R}^m$, $h : D \to \mathbb{R}$ are continuous in $x_0 \in D$, then
$f + g$, $f - g$, $f \cdot g$ and $\frac{f}{h}$ (if $h(x_0) \ne 0$) are continuous in $x_0$.



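3.5.5 Differentiation of Functions in $\mathbb{R}^n$

3.5.5.1 Partial Derivatives

Example 3.22

\[ f : \mathbb{R}^2 \to \mathbb{R}, \qquad f(x_1, x_2) = 2 x_1^2 x_2^3 \]

keep $x_2$ = const., and compute the 1-dim. derivative of $f$ w.r.t. $x_1$:

\[ \frac{\partial f}{\partial x_1}(x_1, x_2) = f_{x_1}(x_1, x_2) = 4 x_1 x_2^3 \]

analogous with $x_1$ = const.:

\[ \frac{\partial f}{\partial x_2} = 6 x_1^2 x_2^2 \]

second derivatives:

\begin{align*}
\frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_2}(x_1, x_2) &= 12 x_1 x_2^2 \\
\frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_1}(x_1, x_2) &= 12 x_1 x_2^2
\end{align*}

\[ \Rightarrow \frac{\partial}{\partial x_1} \frac{\partial f}{\partial x_2} = \frac{\partial}{\partial x_2} \frac{\partial f}{\partial x_1} \]

Example 3.23

\begin{align*}
\Phi(u, v, w) &= uv + \cos w \\
\Phi_u(u, v, w) &= v \\
\Phi_v(u, v, w) &= u \\
\Phi_w(u, v, w) &= -\sin w
\end{align*}

Definition 3.21 If $f(x) = \begin{pmatrix} f_1(x_1, \ldots, x_n) \\ \vdots \\ f_m(x_1, \ldots, x_n) \end{pmatrix}$ is partially differentiable in $x = x_0$, i.e.
all partial derivatives $\frac{\partial f_i}{\partial x_k}(x_0)$ $(i = 1, \ldots, m, \; k = 1, \ldots, n)$ exist, then the matrix

\[ f'(x_0) = \begin{pmatrix}
\frac{\partial f_1}{\partial x_1}(x_0) & \frac{\partial f_1}{\partial x_2}(x_0) & \cdots & \frac{\partial f_1}{\partial x_n}(x_0) \\
\frac{\partial f_2}{\partial x_1}(x_0) & \frac{\partial f_2}{\partial x_2}(x_0) & \cdots & \frac{\partial f_2}{\partial x_n}(x_0) \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial f_m}{\partial x_1}(x_0) & \frac{\partial f_m}{\partial x_2}(x_0) & \cdots & \frac{\partial f_m}{\partial x_n}(x_0)
\end{pmatrix} \]

is called Jacobian matrix.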








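Example 3.24 Linearisation of a function $f : \mathbb{R}^2 \to \mathbb{R}^3$ in $x_0$:

\[ f(x) = \begin{pmatrix} 2 x_2 \\ \sin(x_1 + x_2) \\ \ln(x_1) + x_2 \end{pmatrix} \]

\[ f'(x) = \begin{pmatrix} 0 & 2 \\ \cos(x_1 + x_2) & \cos(x_1 + x_2) \\ \frac{1}{x_1} & 1 \end{pmatrix} \]

1-dimensional:

\[ f'(x_0) = \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} \]

Linearisation $g$ of $f$ in $x_0 = \begin{pmatrix} \pi \\ 0 \end{pmatrix}$:

\[ g(x_1, x_2) = f(\pi, 0) + f'(\pi, 0) \left[ \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} - \begin{pmatrix} \pi \\ 0 \end{pmatrix} \right] \]

\[ g(x_1, x_2) = \begin{pmatrix} 0 \\ 0 \\ \ln \pi \end{pmatrix} + \begin{pmatrix} 0 & 2 \\ -1 & -1 \\ \frac{1}{\pi} & 1 \end{pmatrix} \begin{pmatrix} x_1 - \pi \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 x_2 \\ -x_1 - x_2 + \pi \\ \frac{x_1}{\pi} + x_2 + \ln \pi - 1 \end{pmatrix} \]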
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Note: For x — » xq i.e. close to Xq the linearisation g is a good approximation to / (under 

which condition?). 
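How good the linearisation of Example 3.24 is can be observed numerically. The sketch below assumes $\mathbf{f}(\mathbf{x}) = (2x_2, \sin(x_1+x_2), \ln(x_1)+x_2)^T$ and the linearisation point $\mathbf{x}_0 = (\pi, 0)^T$; the function `max_error` and the choice of moving along the diagonal direction are illustrative assumptions.

```python
import math


def f(x1, x2):
    """Function of Example 3.24."""
    return (2 * x2, math.sin(x1 + x2), math.log(x1) + x2)


def g(x1, x2):
    """Linearisation of f in x0 = (pi, 0)."""
    return (2 * x2,
            -x1 - x2 + math.pi,
            x1 / math.pi + x2 + math.log(math.pi) - 1)


def max_error(d):
    """Largest componentwise deviation |f - g| at the point x0 + (d, d)."""
    x1, x2 = math.pi + d, d
    return max(abs(a - b) for a, b in zip(f(x1, x2), g(x1, x2)))


for d in (0.1, 0.01, 0.001):
    print(d, max_error(d))
```

The deviation shrinks much faster than the distance `d` itself, which is the behaviour expected from a first-order approximation of a smooth function.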



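Example 3.25 We examine the function $f : \mathbb{R}^2 \to \mathbb{R}$ with

$$f(x, y) = \begin{cases} \frac{xy}{\sqrt{x^2 + y^2}} & \text{if } (x, y) \neq (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases}$$

Differentiability: $f$ is differentiable on $\mathbb{R}^2 \setminus \{(0,0)\}$ since it is built up of differentiable functions by sum, product and division.

$$\frac{\partial f}{\partial x}(x, y) = \frac{y}{\sqrt{x^2 + y^2}} - \frac{x^2 y}{(x^2 + y^2)^{3/2}}$$

$$\frac{\partial f}{\partial x}(0, y) = \frac{y}{|y|} = \pm 1 \quad (y \neq 0), \qquad \frac{\partial f}{\partial x}(x, 0) = 0 \quad (x \neq 0)$$

$$\Rightarrow \quad \lim_{y \to 0} \frac{\partial f}{\partial x}(0, y) \neq \lim_{x \to 0} \frac{\partial f}{\partial x}(x, 0)$$

$\Rightarrow$ the partial derivative $\frac{\partial f}{\partial x}$ is not continuous in $(0, 0)$. Indeed $f$ is not differentiable in $(0, 0)$: $f$ vanishes on both axes, so both partial derivatives in $(0,0)$ are zero and the only candidate for the derivative is the zero mapping, but $f(t, t) = |t| / \sqrt{2}$ does not tend to zero faster than the distance $\sqrt{2}\,|t|$ from the origin.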



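Symmetries:

1. $f$ is symmetric w.r.t. exchange of $x$ and $y$, i.e. w.r.t. the plane $y = x$.

2. $f$ is symmetric w.r.t. exchange of $x$ and $-y$, i.e. w.r.t. the plane $y = -x$.

3. $f(-x, y) = -f(x, y)$, i.e. $f$ is symmetric w.r.t. the $y$-axis.

4. $f(x, -y) = -f(x, y)$, i.e. $f$ is symmetric w.r.t. the $x$-axis.

Contours:

$$\frac{xy}{\sqrt{x^2 + y^2}} = c \quad \Leftrightarrow \quad xy = c \sqrt{x^2 + y^2} \quad \Rightarrow \quad x^2 y^2 = c^2 (x^2 + y^2)$$

$$\Leftrightarrow \quad y^2 (x^2 - c^2) = c^2 x^2 \quad \Rightarrow \quad y = \pm \frac{cx}{\sqrt{x^2 - c^2}}$$

Squaring may introduce spurious solutions. Since $xy$ must have the same sign as $c$, only the branch

$$y = \frac{cx}{\sqrt{x^2 - c^2}}, \qquad |x| > |c|,$$

remains. It lies in the 1st quadrant for $c > 0, x > 0$, in the 2nd quadrant for $c < 0, x < 0$, in the 3rd quadrant for $c > 0, x < 0$ and in the 4th quadrant for $c < 0, x > 0$.

Signs in the quadrants:

 - | +
 + | -

Continuity: $f$ is continuous on $\mathbb{R}^2 \setminus \{(0,0)\}$, since it is built up of continuous functions by sum, product and division.

Continuity in $(0, 0)$:

Let $\varepsilon > 0$ such that $|\mathbf{x}| = \varepsilon$, i.e. $\varepsilon = \sqrt{x^2 + y^2} \Leftrightarrow y = \pm\sqrt{\varepsilon^2 - x^2}$

$$f(x, y) = \pm \frac{x \sqrt{\varepsilon^2 - x^2}}{\varepsilon} = \pm x \sqrt{1 - \frac{x^2}{\varepsilon^2}}$$

From $|x| \le \varepsilon$ we get

$$|f(x, y)| \le |x| \le \varepsilon \quad \text{and} \quad \lim_{\mathbf{x} \to 0} f(x, y) = 0$$

Thus $f$ is continuous in $(0, 0)$.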



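3.5.5.2 The Gradient

Definition 3.22 $f : D \to \mathbb{R}$ $(D \subset \mathbb{R}^n)$

The vector

$$\operatorname{grad} f(\mathbf{x}) := f'(\mathbf{x})^T = \begin{pmatrix} \frac{\partial f}{\partial x_1}(\mathbf{x}) \\ \vdots \\ \frac{\partial f}{\partial x_n}(\mathbf{x}) \end{pmatrix}$$

is called gradient of $f$.

The gradient of $f$ points in the direction of the steepest ascent of $f$.

Example 3.26

$$f(x, y) = x^2 + y^2$$

$$\frac{\partial f}{\partial x}(x, y) = 2x, \qquad \frac{\partial f}{\partial y}(x, y) = 2y$$

$$\Rightarrow \quad \operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix} = 2 \begin{pmatrix} x \\ y \end{pmatrix}$$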






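3.5.5.3 Higher Partial Derivatives

Let $\mathbf{f} : D \to \mathbb{R}^m$ $(D \subset \mathbb{R}^n)$. Thus $\frac{\partial \mathbf{f}}{\partial x_i}(\mathbf{x})$ is again a function mapping from $D$ to $\mathbb{R}^m$ and

$$\frac{\partial}{\partial x_k}\left(\frac{\partial \mathbf{f}}{\partial x_i}\right)(\mathbf{x}) =: \frac{\partial^2 \mathbf{f}}{\partial x_k \partial x_i}(\mathbf{x}) = \mathbf{f}_{x_i, x_k}(\mathbf{x})$$

is well defined.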



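Theorem 3.21 Let $D \subset \mathbb{R}^n$ be open and $\mathbf{f} : D \to \mathbb{R}^m$ two times continuously partially differentiable. Then we have for all $\mathbf{x}_0 \in D$ and all $i, j = 1, \ldots, n$

$$\frac{\partial^2 \mathbf{f}}{\partial x_i \partial x_j}(\mathbf{x}_0) = \frac{\partial^2 \mathbf{f}}{\partial x_j \partial x_i}(\mathbf{x}_0)$$

Consequence: If $\mathbf{f} : D \to \mathbb{R}^m$ ($D \subset \mathbb{R}^n$ open) is $k$-times continuously partially differentiable, then

$$\frac{\partial^k \mathbf{f}}{\partial x_{i_k} \partial x_{i_{k-1}} \cdots \partial x_{i_1}} = \frac{\partial^k \mathbf{f}}{\partial x_{i_{\Pi(k)}} \cdots \partial x_{i_{\Pi(1)}}}$$

for any permutation $\Pi$ of the numbers $1, \ldots, k$.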



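3.5.5.4 The Total Differential

If $f : \mathbb{R}^n \to \mathbb{R}$ is differentiable, then the tangential mapping $f_t(\mathbf{x}) = f(\mathbf{x}_0) + f'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0)$ represents a good approximation to the function $f$ in the neighborhood of $\mathbf{x}_0$, which can be seen in

$$f_t(\mathbf{x}) - f(\mathbf{x}_0) = f'(\mathbf{x}_0)(\mathbf{x} - \mathbf{x}_0).$$

With

$$df(\mathbf{x}) := f_t(\mathbf{x}) - f(\mathbf{x}_0) \approx f(\mathbf{x}) - f(\mathbf{x}_0)$$

and

$$d\mathbf{x} = \begin{pmatrix} dx_1 \\ \vdots \\ dx_n \end{pmatrix} := \mathbf{x} - \mathbf{x}_0$$

we get:

$$df(\mathbf{x}) = f'(\mathbf{x}_0)\, d\mathbf{x}$$

or

$$df(\mathbf{x}) = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}(\mathbf{x}_0)\, dx_k = \frac{\partial f}{\partial x_1}(\mathbf{x}_0)\, dx_1 + \ldots + \frac{\partial f}{\partial x_n}(\mathbf{x}_0)\, dx_n$$

Definition 3.23 The linear mapping $df = \sum_{k=1}^{n} \frac{\partial f}{\partial x_k}(\mathbf{x}_0)\, dx_k$ is called total differential of the function $f$ in $\mathbf{x}_0$.

Note: Since in a neighborhood of $\mathbf{x}_0$, $f_t$ is a good approximation of the function $f$, we have for all $\mathbf{x}$ close to $\mathbf{x}_0$:

$$df(\mathbf{x}) \approx f(\mathbf{x}) - f(\mathbf{x}_0).$$

Thus $df(\mathbf{x})$ gives the approximate deviation of the function value $f(\mathbf{x})$ from $f(\mathbf{x}_0)$, when $\mathbf{x}$ deviates from $\mathbf{x}_0$ a little bit.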



3.5.5.5 Application: The Law of Error Propagation

Example 3.27 For a distance of $s = 10$ km a runner needs the time of $t = 30$ min, yielding an average speed of $v = \frac{s}{t} = 20\,\frac{\text{km}}{\text{h}}$. Let the measurement error for the distance $s$ be $\Delta s = \pm 1$ m and for the time $\Delta t = \pm 1$ sec. Give an upper bound on the propagated error $\Delta v$ for the average speed!

This can be solved as follows. To the given measurements $x_1, \ldots, x_n$, a function $f : \mathbb{R}^n \to \mathbb{R}$ has to be applied. The measurement error for $x_1, \ldots, x_n$ is given as $\pm\Delta x_1, \ldots, \pm\Delta x_n$ ($\Delta x_i > 0\ \forall i = 1, \ldots, n$). The law of error propagation gives as a rough upper bound for the error $\Delta f(\mathbf{x})$ of $f(x_1, \ldots, x_n)$ the assessment

$$\Delta f(x_1, \ldots, x_n) < \left| \frac{\partial f}{\partial x_1}(\mathbf{x}) \right| \Delta x_1 + \ldots + \left| \frac{\partial f}{\partial x_n}(\mathbf{x}) \right| \Delta x_n$$

Definition 3.24 We call

$$\Delta f_{max}(x_1, \ldots, x_n) := \left| \frac{\partial f}{\partial x_1}(\mathbf{x}) \right| \Delta x_1 + \ldots + \left| \frac{\partial f}{\partial x_n}(\mathbf{x}) \right| \Delta x_n$$

the maximum error of $f$. The ratio $\frac{\Delta f_{max}(\mathbf{x})}{f(\mathbf{x})}$ is the relative maximum error.



Note: $\Delta f_{max}$ typically gives a too high estimate for the error of $f$, because this value only occurs if all measurement errors $dx_1, \ldots, dx_n$ add up with the same sign. This formula should be applied for about $n < 5$.

Definition 3.25 When the number of measurements $n$ becomes large, a better estimate for the error $\Delta f$ is given by the formula

$$\Delta f_{mean}(x_1, \ldots, x_n) := \sqrt{\left( \frac{\partial f}{\partial x_1}(\mathbf{x}) \right)^2 \Delta x_1^2 + \ldots + \left( \frac{\partial f}{\partial x_n}(\mathbf{x}) \right)^2 \Delta x_n^2}$$

for the mean error of $f$.



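Example 3.28 Solution of example 3.27. Application of the maximum error formula leads to

$$\Delta v(s, t) = \left| \frac{\partial v}{\partial s}(s, t) \right| \Delta s + \left| \frac{\partial v}{\partial t}(s, t) \right| \Delta t = \left| \frac{1}{t} \right| \Delta s + \left| -\frac{s}{t^2} \right| \Delta t = \frac{\Delta s}{t} + \frac{s}{t^2} \Delta t$$

$$= \frac{0.001\ \text{km}}{0.5\ \text{h}} + \frac{10\ \text{km}}{0.25\ \text{h}^2} \cdot \frac{1}{3600}\ \text{h} = \left( 0.002 + \frac{40}{3600} \right) \frac{\text{km}}{\text{h}} = 0.013\ \frac{\text{km}}{\text{h}}$$

This can be compactly written as the result $v = (20 \pm 0.013)\ \frac{\text{km}}{\text{h}}$.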
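The arithmetic of the maximum error formula can be reproduced in a few lines. This is a minimal sketch for the runner example; the function name `max_error` and the unit conversion to km and h are illustrative choices.

```python
def max_error(partials, deltas):
    """Maximum error: sum of |df/dx_i| * delta_x_i over all measurements."""
    return sum(abs(p) * d for p, d in zip(partials, deltas))


s, t = 10.0, 0.5             # distance in km, time in h
ds, dt = 0.001, 1.0 / 3600   # 1 m and 1 sec converted to km and h

v = s / t
partials = (1.0 / t, -s / t**2)   # dv/ds and dv/dt
dv = max_error(partials, (ds, dt))

print(f"v = ({v:.0f} +/- {dv:.3f}) km/h")
```

The time measurement contributes about five times more to the bound than the distance measurement, so improving the stopwatch would pay off first.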



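Definition 3.26 Let $f : D \to \mathbb{R}$ be two times continuously differentiable. The $n \times n$ matrix

$$(\operatorname{Hess} f)(\mathbf{x}) := \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2}(\mathbf{x}) & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n}(\mathbf{x}) \\ \vdots & & \vdots \\ \frac{\partial^2 f}{\partial x_n \partial x_1}(\mathbf{x}) & \cdots & \frac{\partial^2 f}{\partial x_n^2}(\mathbf{x}) \end{pmatrix}$$

is the Hesse matrix of $f$ in $\mathbf{x}$.

Note: $\operatorname{Hess} f$ is symmetric, since

$$\frac{\partial^2 f}{\partial x_i \partial x_j} = \frac{\partial^2 f}{\partial x_j \partial x_i}$$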



3.5.6 Extrema without Constraints 

Again we appeal to your memories of one-dimensional analysis: How do you determine extrema of a function $f : \mathbb{R} \to \mathbb{R}$? This is just a special case of what we do now.

Definition 3.27 Let $D \subset \mathbb{R}^n$ and $f : D \to \mathbb{R}$ a function. A point $\mathbf{x} \in D$ is a local maximum (minimum) of $f$, if there is a neighborhood $U \subset D$ of $\mathbf{x}$ such that

$$f(\mathbf{x}) \ge f(\mathbf{y}) \quad (f(\mathbf{x}) \le f(\mathbf{y})) \quad \forall \mathbf{y} \in U.$$

Analogously, we have an isolated local maximum (minimum) in $\mathbf{x}$, if there is a neighborhood $U \subset D$ of $\mathbf{x}$ such that

$$f(\mathbf{x}) > f(\mathbf{y}) \quad (\text{resp. } f(\mathbf{x}) < f(\mathbf{y})) \quad \forall \mathbf{y} \in U,\ \mathbf{y} \neq \mathbf{x}$$

All these points are called extrema.

If the mentioned neighborhood $U$ of an extremum is the whole domain, i.e. $U = D$, then the extremum is global.



Give all local, global, isolated and non-isolated maxima and minima of the function shown 
in the following graphs: 




Plot3D[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 30]

ContourPlot[f[x,y], {x,-5,5}, {y,-5,5}, PlotPoints -> 60,
ContourSmoothing -> True, ContourShading -> False]



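Theorem 3.22 Let $D \subset \mathbb{R}^n$ be open and $f : D \to \mathbb{R}$ partially differentiable. If $f$ has a local extremum in $\mathbf{x} \in D$, then $\operatorname{grad} f(\mathbf{x}) = 0$.

Proof: Reduction to the 1-dim. case:

For $i = 1, \ldots, n$ define $g_i(h) := f(x_1, \ldots, x_i + h, \ldots, x_n)$. If $f$ has a local extremum in $\mathbf{x}$, then all $g_i$ have a local extremum in 0. Thus we have for all $i$: $g_i'(0) = 0$. Since $g_i'(0) = \frac{\partial f}{\partial x_i}(\mathbf{x})$ we get

$$\operatorname{grad} f(\mathbf{x}) = \begin{pmatrix} \frac{\partial f}{\partial x_1}(\mathbf{x}) \\ \vdots \\ \frac{\partial f}{\partial x_n}(\mathbf{x}) \end{pmatrix} = 0$$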



Note:

• Theorem 3.22 represents a necessary condition for local extrema.

• Why is the proposition of Theorem 3.22 false if $D \subset \mathbb{R}^n$ is no open set?



Linear algebra reminder: 



Definition 3.28 Let $A$ be a symmetric $n \times n$ matrix of real numbers.

$A$ is positive (negative) definite, if all eigenvalues of $A$ are positive (negative).
$A$ is positive (negative) semidefinite, if all eigenvalues are $\ge 0$ ($\le 0$).

$A$ is indefinite, if all eigenvalues are $\neq 0$ and there exist positive as well as negative eigenvalues.



Theorem 3.23 Criterion of Hurwitz

Let $A$ be a real valued symmetric matrix. $A$ is positive definite, if and only if for $k = 1, \ldots, n$

$$\begin{vmatrix} a_{11} & \cdots & a_{1k} \\ \vdots & & \vdots \\ a_{k1} & \cdots & a_{kk} \end{vmatrix} > 0$$

$A$ is negative definite if and only if $-A$ is positive definite.
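The criterion of Hurwitz translates directly into a test on the leading principal minors. The following sketch is only meant for small matrices given as nested lists; the helper names `det`, `leading_minors`, `is_positive_definite` and `is_negative_definite` are illustrative, and the determinant uses plain Laplace expansion rather than an efficient elimination method.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def leading_minors(a):
    """Determinants of the upper-left k x k submatrices for k = 1, ..., n."""
    return [det([row[:k] for row in a[:k]]) for k in range(1, len(a) + 1)]


def is_positive_definite(a):
    """Hurwitz: a symmetric matrix is positive definite iff all leading minors are > 0."""
    return all(d > 0 for d in leading_minors(a))


def is_negative_definite(a):
    """A is negative definite iff -A is positive definite."""
    return is_positive_definite([[-x for x in row] for row in a])


print(leading_minors([[2, 0], [0, 2]]), is_positive_definite([[2, 0], [0, 2]]))
print(leading_minors([[8, -3], [-3, 0]]), is_positive_definite([[8, -3], [-3, 0]]))
```

A matrix that fails both tests is not necessarily indefinite; it may also be semidefinite, which the criterion as stated does not distinguish.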



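Theorem 3.24 For $D \subset \mathbb{R}^n$ open and two times continuously differentiable $f : D \to \mathbb{R}$ with $\operatorname{grad} f(\mathbf{x}) = 0$ for $\mathbf{x} \in D$ the following holds:

a) $(\operatorname{Hess} f)(\mathbf{x})$ positive definite $\Rightarrow$ $f$ has in $\mathbf{x}$ an isolated minimum

b) $(\operatorname{Hess} f)(\mathbf{x})$ negative definite $\Rightarrow$ $f$ has in $\mathbf{x}$ an isolated maximum

c) $(\operatorname{Hess} f)(\mathbf{x})$ indefinite $\Rightarrow$ $f$ has in $\mathbf{x}$ no local extremum.

Note: Theorem 3.24 is void if $(\operatorname{Hess} f)(\mathbf{x})$ is positive or negative semidefinite.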

Procedure for the application of theorems 3.22 and 3.24 to search local extrema of a function $f : (D \subset \mathbb{R}^n) \to \mathbb{R}$:

1. Computation of $\operatorname{grad} f$

2. Computation of the zeros of $\operatorname{grad} f$

3. Computation of the Hessian matrix $\operatorname{Hess} f$

4. Evaluation of $\operatorname{Hess} f(\mathbf{x})$ for all zeros $\mathbf{x}$ of $\operatorname{grad} f$.
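For functions of two variables, step 4 of this procedure can be sketched as follows. The code assumes that the zero of the gradient and the entries of the symmetric Hessian $\begin{pmatrix} a & b \\ b & d \end{pmatrix}$ at that point have already been computed by hand; `eigenvalues_sym2` and `classify` are hypothetical helper names, and the eigenvalues come from the closed-form solution of the characteristic polynomial of a symmetric $2 \times 2$ matrix.

```python
import math


def eigenvalues_sym2(a, b, d):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius


def classify(a, b, d):
    """Classify a zero of the gradient from the Hessian [[a, b], [b, d]]."""
    lo, hi = eigenvalues_sym2(a, b, d)
    if lo > 0:
        return "isolated minimum"       # positive definite
    if hi < 0:
        return "isolated maximum"       # negative definite
    if lo < 0 < hi:
        return "no local extremum"      # indefinite
    return "undecided (semidefinite)"   # the Hessian test makes no statement


print(classify(2, 0, 2))    # Hessian of x^2 + y^2 at the origin
print(classify(2, 0, -2))   # Hessian of x^2 - y^2 at the origin
print(classify(2, 0, 0))    # Hessian of x^2 + y^4 at the origin
```

The last call shows the limitation of the Hessian test: for a semidefinite Hessian the function has to be examined by other means.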

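Example 3.29 Some simple functions $f : \mathbb{R}^2 \to \mathbb{R}$:

1. $f(x, y) = x^2 + y^2 + c$

$$\operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 2y \end{pmatrix} \quad \Rightarrow \quad \operatorname{grad} f(0, 0) = \begin{pmatrix} 0 \\ 0 \end{pmatrix} = 0$$

$$\operatorname{Hess} f = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$$

is positive definite on all $\mathbb{R}^2$.
$\Rightarrow$ $f$ has an isolated local minimum in 0 (paraboloid).

2. $f(x, y) = -x^2 - y^2 + c$

$$\operatorname{grad} f(0, 0) = 0, \qquad \operatorname{Hess} f = \begin{pmatrix} -2 & 0 \\ 0 & -2 \end{pmatrix}$$

$\Rightarrow$ isolated local maximum in 0 (paraboloid).

3. $f(x, y) = ax + by + c$, $\quad a, b \neq 0$

$$\operatorname{grad} f = \begin{pmatrix} a \\ b \end{pmatrix} \neq 0 \quad \forall \mathbf{x} \in \mathbb{R}^2$$

$\Rightarrow$ no local extremum.

4. $f(x, y) = x^2 - y^2 + c$

$$\operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ -2y \end{pmatrix} \quad \Rightarrow \quad \operatorname{grad} f(0, 0) = 0$$

$$\operatorname{Hess} f = \begin{pmatrix} 2 & 0 \\ 0 & -2 \end{pmatrix}$$

$\Rightarrow$ $\operatorname{Hess} f$ indefinite $\Rightarrow$ $f$ has no local extremum.

5. $f(x, y) = x^2 + y^4$

$$\operatorname{grad} f = \begin{pmatrix} 2x \\ 4y^3 \end{pmatrix} \quad \Rightarrow \quad \operatorname{grad} f(0, 0) = 0$$

$$\operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$

$\Rightarrow$ $\operatorname{Hess} f$ positive semidefinite, but $f$ has in 0 an isolated minimum.

6. $f(x, y) = x^2$

$$\operatorname{grad} f = \begin{pmatrix} 2x \\ 0 \end{pmatrix} \quad \Rightarrow \quad \operatorname{grad} f(0, y) = 0$$

$$\operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$

$\Rightarrow$ $\operatorname{Hess} f$ positive semidefinite, but $f$ has a (non isolated) local minimum. All points on the $y$-axis ($x = 0$) are local minima.

7. $f(x, y) = x^2 + y^3$

$$\operatorname{grad} f(x, y) = \begin{pmatrix} 2x \\ 3y^2 \end{pmatrix} \quad \Rightarrow \quad \operatorname{grad} f(0, 0) = 0$$

$$\operatorname{Hess} f(0, 0) = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}$$

$\Rightarrow$ $\operatorname{Hess} f$ positive semidefinite, but $f$ has no local extremum.

3.5.7 Extrema with Constraints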

Example 3.30 Which rectangle (length $x$, width $y$) has maximal area, given the perimeter $U$?

Area $f(x, y) = xy$. The function $f(x, y)$ has no local maximum on $\mathbb{R}^2$!

Constraint: $U = 2(x + y)$ or $x + y = \frac{U}{2}$, substituted in $f(x, y) = xy$

$$\Rightarrow \quad g(x) := f\left(x, \frac{U}{2} - x\right) = x\left(\frac{U}{2} - x\right) = \frac{U}{2} x - x^2$$

$$g'(x) = \frac{U}{2} - 2x = 0 \quad \Rightarrow \quad x = \frac{U}{4},\ y = \frac{U}{4} \quad \Rightarrow \quad x = y$$

$$g''\left(\frac{U}{4}\right) = -2$$

$\Rightarrow$ $x = y = U/4$ is (the unique) maximum of the area for constant perimeter $U$!

In many cases substitution of constraints is not feasible! 

Wanted: Extremum of a function $f(x_1, \ldots, x_n)$ under the $p$ constraints

$$h_1(x_1, \ldots, x_n) = 0$$
$$\vdots$$
$$h_p(x_1, \ldots, x_n) = 0$$

Theorem 3.25 Let $f : D \to \mathbb{R}$ and $\mathbf{h} : D \to \mathbb{R}^p$ be continuously differentiable functions on an open set $D \subset \mathbb{R}^n$, $n > p$, and let the matrix $\mathbf{h}'(\mathbf{x})$ have rank $p$ for all $\mathbf{x} \in D$.

If $\mathbf{x}_0 \in D$ is an extremum of $f$ under the constraint(s) $\mathbf{h}(\mathbf{x}_0) = 0$, there exist real numbers $\lambda_1, \ldots, \lambda_p$ with

$$\frac{\partial f}{\partial x_i}(\mathbf{x}_0) + \sum_{k=1}^{p} \lambda_k \frac{\partial h_k}{\partial x_i}(\mathbf{x}_0) = 0 \quad \forall i = 1, \ldots, n$$

and

$$h_k(\mathbf{x}_0) = 0 \quad \forall k = 1, \ldots, p$$

Illustration:

For $p = 1$, i.e. only one given constraint, the theorem implies that for an extremum $\mathbf{x}_0$ of $f$ under the constraint $h(\mathbf{x}_0) = 0$ we have

$$\operatorname{grad} f(\mathbf{x}_0) + \lambda \operatorname{grad} h(\mathbf{x}_0) = 0$$

• $\operatorname{grad} f$ and $\operatorname{grad} h$ are parallel in the extremum $\mathbf{x}_0$!

• Contours of $f$ and $h$ for $h(\mathbf{x}) = 0$ are parallel in $\mathbf{x}_0$.

• The numbers $\lambda_1, \ldots, \lambda_p$ are the Lagrange multipliers.

Note: We have to solve $n + p$ equations with $n + p$ unknowns. Among the solutions of this (possibly nonlinear) system the extrema have to be determined. Not all solutions need to be extrema of $f$ under the constraint(s) $\mathbf{h}(\mathbf{x}_0) = 0$ (necessary but not sufficient condition for extrema).



Definition 3.29 Let $f$, $\mathbf{h}$ be given as in theorem 3.25. The function $L : D \to \mathbb{R}$

$$L(x_1, \ldots, x_n) = f(x_1, \ldots, x_n) + \sum_{k=1}^{p} \lambda_k h_k(x_1, \ldots, x_n)$$

is called Lagrange function.

Conclusion: The equations to be solved in theorem 3.25 can be represented as:

$$\frac{\partial L}{\partial x_i}(\mathbf{x}) = 0 \quad (i = 1, \ldots, n)$$

$$h_k(\mathbf{x}) = 0 \quad (k = 1, \ldots, p)$$

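Example 3.31 Extrema of $f(x, y) = x^2 + y^2 + 3$ under the constraint $h(x, y) = x^2 + y - 2 = 0$

[Figure: contours of $x^2 + y^2 + 3$ and the constraint $x^2 + y - 2 = 0$]

$$L(x, y) = x^2 + y^2 + 3 + \lambda (x^2 + y - 2)$$

$$\operatorname{grad} L(x, y) = \begin{pmatrix} 2x + 2\lambda x \\ 2y + \lambda \end{pmatrix}$$

$\operatorname{grad} L(x, y) = 0$, $h(x, y) = 0$:

$$2x + 2\lambda x = 0 \quad (1)$$
$$2y + \lambda = 0 \quad (2)$$
$$x^2 + y - 2 = 0 \quad (3)$$

(2) in (1): $2x - 4xy = 0 \quad (4)$

(3): $y = 2 - x^2 \quad (3a)$

(3a) in (4): $2x - 4x(2 - x^2) = 0$

First solution: $\mathbf{x}_1 = \begin{pmatrix} 0 \\ 2 \end{pmatrix}$ is a maximum.

For $x \neq 0$ division by $x$ gives

$$2 - 8 + 4x^2 = 0$$
$$4x^2 = 6$$

and thus $x_{2,3} = \pm\sqrt{3/2}$ with $y_{2,3} = 2 - x^2 = \frac{1}{2}$.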
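The candidates of Example 3.31 can be checked by inserting them into equations (1) to (3). In the sketch below the multiplier values $\lambda = -2y$ follow from equation (2); the function `residuals` and the list `candidates` are illustrative names.

```python
import math


def f(x, y):
    return x**2 + y**2 + 3


def h(x, y):
    return x**2 + y - 2


def residuals(x, y, lam):
    """Left-hand sides of equations (1), (2) and (3)."""
    return (2 * x + 2 * lam * x, 2 * y + lam, h(x, y))


# (x, y, lambda) with lambda = -2y from equation (2)
candidates = [
    (0.0, 2.0, -4.0),
    (math.sqrt(1.5), 0.5, -1.0),
    (-math.sqrt(1.5), 0.5, -1.0),
]

for x, y, lam in candidates:
    print((x, y), residuals(x, y, lam), f(x, y))
```

All three points satisfy the necessary condition. On the constraint curve the function value is 7 in $(0, 2)$ and 4.75 in the other two points, which is consistent with $(0, 2)$ being a local maximum under the constraint.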
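Example 3.32 Extrema of the function $f(x, y) = 4x^2 - 3xy$ on the disc

$$\bar{K}_{0,1} = \{(x, y) \mid x^2 + y^2 \le 1\}.$$

Show[ContourPlot[4*x^2 - 3*x*y, {x,-1,1}, {y,-1,1}, PlotPoints -> 60,
Contours -> 20, ContourSmoothing -> True, ContourShading -> False, PlotLabel -> ""],
Plot[{Sqrt[1-x^2], -Sqrt[1-x^2]}, {x,-1,1}], AspectRatio -> 1]

1. Local extrema inside the disc $D_{0,1}$:

$$\operatorname{grad} f(x, y) = \begin{pmatrix} 8x - 3y \\ -3x \end{pmatrix} = 0 \quad \Rightarrow \quad \mathbf{x} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$

is the unique zero of the gradient.

$$\operatorname{Hess} f = \begin{pmatrix} 8 & -3 \\ -3 & 0 \end{pmatrix}, \qquad |8| = 8 > 0, \qquad \begin{vmatrix} 8 & -3 \\ -3 & 0 \end{vmatrix} = 0 - 9 = -9$$

$\Rightarrow$ $\operatorname{Hess} f$ is neither positive nor negative definite. Eigenvalues of $\operatorname{Hess} f =: A$:

$$A\mathbf{x} = \lambda \mathbf{x} \quad \Leftrightarrow \quad (A - \lambda)\mathbf{x} = 0$$

$$\Leftrightarrow \quad \begin{pmatrix} 8 - \lambda & -3 \\ -3 & -\lambda \end{pmatrix} \mathbf{x} = 0 \quad \Leftrightarrow \quad \det \begin{pmatrix} 8 - \lambda & -3 \\ -3 & -\lambda \end{pmatrix} = 0$$

$$\Leftrightarrow \quad (8 - \lambda)(-\lambda) - 9 = 0 \quad \Leftrightarrow \quad \lambda^2 - 8\lambda - 9 = 0$$

$$\lambda_{1,2} = 4 \pm \sqrt{16 + 9}, \qquad \lambda_1 = 9, \qquad \lambda_2 = -1$$

$\Rightarrow$ $\operatorname{Hess} f$ is indefinite
$\Rightarrow$ $f$ has no local extremum on any open set $D$.
$\Rightarrow$ in particular $f$ has on $D_{0,1}$ no extremum!

2. Local extrema on the margin, i.e. on $\partial D_{0,1}$: local extrema of $f(x, y) = 4x^2 - 3xy$ under the constraint $x^2 + y^2 - 1 = 0$:

Lagrange function $L = 4x^2 - 3xy + \lambda (x^2 + y^2 - 1)$

$$\frac{\partial L}{\partial x} = 8x - 3y + 2\lambda x = (2\lambda + 8)x - 3y$$

$$\frac{\partial L}{\partial y} = -3x + 2\lambda y$$

Equations for $x, y, \lambda$:

$$(1) \quad 8x - 3y + 2\lambda x = 0$$
$$(2) \quad -3x + 2\lambda y = 0$$
$$(3) \quad x^2 + y^2 - 1 = 0$$

$(1)\,y - (2)\,x = (4)$: $\quad 8xy - 3y^2 + 3x^2 = 0$

First solution: $(3) \Rightarrow (3a)$: $y^2 = 1 - x^2$

$(3a)$ in $(4)$: $\quad \pm 8x\sqrt{1 - x^2} - 3(1 - x^2) + 3x^2 = 0$

Subst.: $x^2 = u$: $\quad \pm 8\sqrt{u}\sqrt{1 - u} = 3(1 - u) - 3u = 3 - 6u$

Squaring:

$$64u(1 - u) = 9 - 36u + 36u^2$$
$$-64u^2 + 64u - 36u^2 + 36u - 9 = 0$$
$$-100u^2 + 100u - 9 = 0$$
$$u^2 - u + \frac{9}{100} = 0$$

$$u_{1,2} = \frac{1}{2} \pm \sqrt{\frac{1}{4} - \frac{9}{100}} = \frac{1}{2} \pm \sqrt{\frac{25 - 9}{100}} = \frac{1}{2} \pm \frac{4}{10}$$

$$u_1 = 0.1, \qquad u_2 = 0.9$$

$$x_{1,2} = \pm\frac{1}{\sqrt{10}} \approx \pm 0.3162, \qquad x_{3,4} = \pm\frac{3}{\sqrt{10}} \approx \pm 0.9487$$

Contours:

$$4x^2 - 3xy = c \quad \Rightarrow \quad y = \frac{4x^2 - c}{3x} = \frac{4}{3}x - \frac{c}{3x}$$

$$x_3 = \frac{3}{\sqrt{10}} \quad \Rightarrow \quad y_3 = \pm\sqrt{1 - x_3^2} = \pm\frac{1}{\sqrt{10}}$$

$$f\left(\frac{3}{\sqrt{10}}, \frac{1}{\sqrt{10}}\right) = 4 \cdot \frac{9}{10} - 3 \cdot \frac{3}{10} = \frac{27}{10}$$

$$f\left(\frac{3}{\sqrt{10}}, -\frac{1}{\sqrt{10}}\right) = 4 \cdot \frac{9}{10} + 3 \cdot \frac{3}{10} = \frac{45}{10}$$

Only the second point satisfies equation (4); the first one is a spurious solution introduced by squaring. Analogously $f\left(\frac{1}{\sqrt{10}}, \frac{3}{\sqrt{10}}\right) = \frac{4}{10} - \frac{9}{10} = -\frac{5}{10}$.

$f(x, y)$ has on $\bar{K}_{0,1}$ in $\mathbf{x}_1 = \begin{pmatrix} \frac{3}{\sqrt{10}} \\ -\frac{1}{\sqrt{10}} \end{pmatrix}$ and in $\mathbf{x}_2 = \begin{pmatrix} -\frac{3}{\sqrt{10}} \\ \frac{1}{\sqrt{10}} \end{pmatrix}$ isolated local maxima.

$f(x, y)$ has on $\bar{K}_{0,1}$ in $\mathbf{x}_3 = \begin{pmatrix} \frac{1}{\sqrt{10}} \\ \frac{3}{\sqrt{10}} \end{pmatrix}$ and in $\mathbf{x}_4 = \begin{pmatrix} -\frac{1}{\sqrt{10}} \\ -\frac{3}{\sqrt{10}} \end{pmatrix}$ isolated local minima.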
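The extrema of Example 3.32 on the margin of the disc can be cross-checked without Lagrange multipliers by scanning the unit circle. This is only a brute-force sketch; the parametrisation $x = \cos t$, $y = \sin t$, the function `boundary_extrema` and the number of samples are illustrative choices.

```python
import math


def f(x, y):
    return 4 * x**2 - 3 * x * y


def boundary_extrema(samples=100000):
    """Scan the unit circle and return the largest and smallest (f, x, y) found."""
    points = []
    for i in range(samples):
        t = 2 * math.pi * i / samples
        x, y = math.cos(t), math.sin(t)
        points.append((f(x, y), x, y))
    return max(points), min(points)


(f_max, x_max, y_max), (f_min, x_min, y_min) = boundary_extrema()
print(f_max, x_max, y_max)
print(f_min, x_min, y_min)
```

The scan finds a maximum value close to 4.5 at a point with $|x| \approx 3/\sqrt{10}$ and opposite signs of $x$ and $y$, and a minimum value close to $-0.5$ at a point with $|x| \approx 1/\sqrt{10}$ and equal signs.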
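3.5.7.1 The Bordered Hessian

In order to check whether a candidate point for a constrained extremum is a maximum or minimum, we need a sufficient condition, similarly to the definiteness of the Hessian in the unconstrained case. Here we need the Bordered Hessian

$$\overline{\operatorname{Hess}} := \begin{pmatrix} 0 & \cdots & 0 & \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_1}{\partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ 0 & \cdots & 0 & \frac{\partial h_p}{\partial x_1} & \cdots & \frac{\partial h_p}{\partial x_n} \\ \frac{\partial h_1}{\partial x_1} & \cdots & \frac{\partial h_p}{\partial x_1} & \frac{\partial^2 L}{\partial x_1^2} & \cdots & \frac{\partial^2 L}{\partial x_1 \partial x_n} \\ \vdots & & \vdots & \vdots & & \vdots \\ \frac{\partial h_1}{\partial x_n} & \cdots & \frac{\partial h_p}{\partial x_n} & \frac{\partial^2 L}{\partial x_n \partial x_1} & \cdots & \frac{\partial^2 L}{\partial x_n^2} \end{pmatrix}$$

This matrix can be used to check on local minima and maxima by computing certain subdeterminants. Here we show this only for the two dimensional case with one constraint where the bordered Hessian has the form

$$\overline{\operatorname{Hess}} := \begin{pmatrix} 0 & \frac{\partial h}{\partial x_1} & \frac{\partial h}{\partial x_2} \\ \frac{\partial h}{\partial x_1} & \frac{\partial^2 L}{\partial x_1^2} & \frac{\partial^2 L}{\partial x_1 \partial x_2} \\ \frac{\partial h}{\partial x_2} & \frac{\partial^2 L}{\partial x_2 \partial x_1} & \frac{\partial^2 L}{\partial x_2^2} \end{pmatrix}$$

and the sufficient criterion for local extrema is (in contrast to the unconstrained case!) the following simple determinant condition:

Under the constraint $h(x, y) = 0$ the function $f$ has in $(x, y)$ a

• local maximum, if $|\overline{\operatorname{Hess}}(x, y)| > 0$

• local minimum, if $|\overline{\operatorname{Hess}}(x, y)| < 0$.

If $|\overline{\operatorname{Hess}}(x, y)| = 0$, we cannot decide on the properties of the stationary point $(x, y)$.

Application to example 3.31 yields

$$\operatorname{grad} L(x, y) = \begin{pmatrix} 2x(1 + \lambda) \\ 2y + \lambda \end{pmatrix}$$

$$\overline{\operatorname{Hess}}(x, y) = \begin{pmatrix} 0 & 2x & 1 \\ 2x & 2(1 + \lambda) & 0 \\ 1 & 0 & 2 \end{pmatrix}$$

Substitution of the first solution of $\operatorname{grad} L = 0$, which is $x = 0$, $y = 2$, $\lambda = -4$, into this matrix gives

$$|\overline{\operatorname{Hess}}(0, 2)| = \begin{vmatrix} 0 & 0 & 1 \\ 0 & -6 & 0 \\ 1 & 0 & 2 \end{vmatrix} = 6$$

which proves that we indeed have a maximum in $(0, 2)$.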
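The determinant test can be evaluated for all stationary points of example 3.31 at once. The sketch below hard-codes the bordered Hessian of that example, with $h = x^2 + y - 2$ and $L = x^2 + y^2 + 3 + \lambda(x^2 + y - 2)$; `det3` and `bordered_hessian` are illustrative helper names, and the two further stationary points $(\pm\sqrt{3/2}, 1/2)$ with $\lambda = -1$ are assumed from solving $\operatorname{grad} L = 0$.

```python
import math


def det3(m):
    """Determinant of a 3 x 3 matrix by expansion along the first row."""
    (a, b, c), (d, e, f_), (g, h_, i) = m
    return a * (e * i - f_ * h_) - b * (d * i - f_ * g) + c * (d * h_ - e * g)


def bordered_hessian(x, y, lam):
    """Bordered Hessian of example 3.31 at the point (x, y) with multiplier lam."""
    return [[0, 2 * x, 1],
            [2 * x, 2 * (1 + lam), 0],
            [1, 0, 2]]


for x, y, lam in [(0.0, 2.0, -4.0),
                  (math.sqrt(1.5), 0.5, -1.0),
                  (-math.sqrt(1.5), 0.5, -1.0)]:
    d = det3(bordered_hessian(x, y, lam))
    kind = "maximum" if d > 0 else "minimum" if d < 0 else "undecided"
    print((x, y), d, kind)
```

Under these assumptions the determinant is 6 in $(0, 2)$ and $-12$ in the two other points, so the criterion classifies them as local minima under the constraint.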






3.6 Exercises 



Sequences, Series, Continuity 



Exercise 3.1 Prove (e.g. with complete induction) that for $p \in \mathbb{R}$ it holds:

$$\sum_{k=0}^{n} (p + k) = \frac{(n + 1)(2p + n)}{2}$$



Exercise 3.2

a) Calculate

$$\sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \sqrt{1 + \ldots}}}}},$$

i.e. the limit of the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_0 = 1$ and $a_{n+1} = \sqrt{1 + a_n}$. Give an exact solution as well as an approximation with a precision of 10 decimal places.

b) Prove that the sequence $(a_n)_{n \in \mathbb{N}}$ converges.

Exercise 3.3 Calculate

$$1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \cfrac{1}{1 + \ldots}}}},$$

i.e. the limit of the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_0 = 1$ and $a_{n+1} = 1 + 1/a_n$. Give an exact solution as well as an approximation with a precision of 10 decimal places.



Exercise 3.4 Calculate the number of possible draws in the German lottery which result in having three correct numbers. In the German lottery, 6 balls are drawn out of 49. The 49 balls are numbered from 1 to 49. A drawn ball is not put back into the pot. In each lottery ticket field, the player chooses 6 numbers out of 49. What is the probability of having three correct numbers?



Exercise 3.5 Investigate the sequence $(a_n)_{n \in \mathbb{N}}$ with $a_n := 1 + \frac{1}{2} + \frac{1}{3} + \ldots + \frac{1}{n}$ regarding convergence.



Exercise 3.6 



Calculate the infinite sum 




n = 0 



Exercise 3.7 Prove: A series $\sum_{k=0}^{\infty} a_k$ with $\forall k : a_k \ge 0$ converges if and only if the sequence of the partial sums is bounded.

Exercise 3.8 Calculate an approximation (if possible) for the following series and investigate their convergence.

a) $\sum_{n=0}^{\infty} (n + 1)\, 2^{-n}$ $\qquad$ b) $\sum_{n=0}^{\infty} 4^n (n + 1)!\, n^{-n}$ $\qquad$ c) $\sum_{n=0}^{\infty} 3n \left[ 4 + (1/n) \right]^{-n}$



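Exercise 3.9 Investigate the following functions $f : \mathbb{R} \to \mathbb{R}$ regarding continuity (give an outline for each graph):

a) $f(x) = \frac{1}{1 + e^{x}}$

b) $f(x) = \begin{cases} 0 & \text{if } x = 1 \\ \frac{1}{x - 1} & \text{else} \end{cases}$

c) $f(x) = \begin{cases} x + 4 & \text{if } x > 0 \\ (x + 4)^2 & \text{else} \end{cases}$

d) $f(x) = \begin{cases} (x - 2)^2 & \text{if } x > 0 \\ (x + 2)^2 & \text{else} \end{cases}$

e) $f(x) = |x|$

f) $f(x) = x - \lfloor x \rfloor$

g) $f(x) = \left\lfloor x + \frac{1}{2} \right\rfloor - x$

Exercise 3.10 Show that $f : \mathbb{R} \to \mathbb{R}$ with

$$f(x) = \begin{cases} 0 & \text{if } x \text{ rational} \\ 1 & \text{if } x \text{ irrational} \end{cases}$$

is not continuous in any point.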



Taylor Series

Exercise 3.11 Calculate the Taylor series of sine and cosine with $x_0 = 0$. Prove that the Taylor series of sine converges towards the sine function.

Exercise 3.12 Try to expand the function $f(x) = \sqrt{x}$ at $x_0 = 0$ and $x_0 = 1$ into a Taylor series. Report about possible problems.

Exercise 3.13 Let $f$ be expandable into a Taylor series on the interval $(-r, r)$ around 0 ($r > 0$). Prove:

a) If $f$ is an even function ($f(x) = f(-x)$) for all $x \in (-r, r)$, then only even exponents appear in the Taylor series of $f$, it has the form $\sum_{k=0}^{\infty} a_{2k} x^{2k}$.

b) If $f$ is an odd function ($f(x) = -f(-x)$) for all $x \in (-r, r)$, then only odd exponents appear in the Taylor series of $f$, it has the form $\sum_{k=0}^{\infty} a_{2k+1} x^{2k+1}$.

Exercise 3.14 Calculate the Taylor series of the function

$$f(x) = \begin{cases} e^{-\frac{1}{x^2}} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$$

at $x_0 = 0$ and analyse the series for convergence. Justify the result!

Exercise 3.15 Calculate the Taylor series of the function arctan in $x_0 = 0$. Use the result for the approximate calculation of $\pi$. (Use for this for example $\tan(\pi/4) = 1$.)



Functions from $\mathbb{R}^n$ to $\mathbb{R}^m$

Exercise 3.16 Prove that the scalar product of a vector $\mathbf{x}$ with itself is equal to the square of its length (norm).

Exercise 3.17

a) Give a formal definition of the function $f : \mathbb{R} \to \mathbb{R}^+ \cup \{0\}$ with $f(x) = |x|$.

b) Prove that for all real numbers $x, y$: $|x + y| \le |x| + |y|$.

Exercise 3.18

a) In quality control in industrial production, components are measured and the values $x_1, \ldots, x_n$ determined. The vector $\mathbf{d} = \mathbf{x} - \mathbf{s}$ indicates the deviation of the measurements from the nominal values $s_1, \ldots, s_n$. Now define a norm on $\mathbb{R}^n$ such that $\|\mathbf{d}\| < \varepsilon$ holds, iff all deviations from the nominal value are less than a given tolerance $\varepsilon$.

b) Prove that the norm defined in a) satisfies all axioms of a norm.

Exercise 3.19 Draw the graph of the following functions $f : \mathbb{R}^2 \to \mathbb{R}$ (first manually and then by the computer!):



fi(x,y) = x 2 + y 3 , f 2 (x,y) 



x 2 + e~^ 2 



$$f_3(x, y) = x^2 + e^{-(5(x + y))^2} + e^{-(5(x - y))^2}$$

Exercise 3.20 Calculate the partial derivatives of the following functions $f : \mathbb{R}^3 \to \mathbb{R}$



a) f(x) = \x\ b) f(x) = x * 2 + x * 3 c) f(x) = 

d) /( x) = sin(xi + x 2 ) e) f(x) = sin(xi + ax 2 ) 



Exercise 3.21 Build a function $f : \mathbb{R}^2 \to \mathbb{R}$, which generates roughly the following graph:

Plot3D[f[x,y], {x,-5,5}, {y,-5,5},
PlotPoints -> 30]

ContourPlot[f[x,y], {x,-5,5}, {y,-5,5},
PlotPoints -> 60, ContourSmoothing ->
True, ContourShading -> False]

Exercise 3.22 Calculate the derivative matrix of the function $\mathbf{f}(x_1, x_2, x_3) = \begin{pmatrix} \sqrt{x_1 x_2 x_3} \\ \sin(x_1 x_2 x_3) \end{pmatrix}$.

Exercise 3.23 For $\mathbf{f}(x, y) = \begin{pmatrix} \ldots \\ \sin(e^x + e^y) \end{pmatrix}$, find the tangent plane at $\mathbf{x}_0 = \ldots$

Exercise 3.24 Draw the graph of the function

$$f(x, y) = \begin{cases} y \left( 1 + \cos \frac{\pi x}{y} \right) & \text{for } |y| > |x| \\ 0 & \text{else} \end{cases}$$

Show that $f$ is continuous and partially differentiable in $\mathbb{R}^2$, but not differentiable in 0.

Exercise 3.25 Calculate the gradient of the function $f(x, y) = \frac{x^2 + y^2}{1 + x^4 + y^4}$ and draw it as an arrow at different places in a contour lines image of $f$.

Exercise 3.26 The viscosity $\eta$ of a liquid is to be determined with the formula $K = 6\pi\eta v r$. Measured: $r = 3$ cm, $v = 5$ cm/sec, $K = 1000$ dyn. Measurement error: $|\Delta r| \le 0.1$ cm, $|\Delta v| \le 0.003$ cm/sec, $|\Delta K| \le 0.1$ dyn. Determine the viscosity $\eta$ and its error $\Delta\eta$.







Extrema 

Exercise 3.27 Examine the following functions for extrema and specify whether each is a local, global, or isolated extremum:

a) $f(x, y) = x^3 y^2 (1 - x - y)$

b) $g(x, y) = x^k + (x + y)^2 \quad (k = 0, 3, 4)$

Exercise 3.28 Given the function $f : \mathbb{R}^2 \to \mathbb{R}$, $f(x, y) = (y - x^2)(y - 3x^2)$.

a) Calculate $\operatorname{grad} f$ and show: $\operatorname{grad} f(x, y) = 0 \Leftrightarrow x = y = 0$.

b) Show that $(\operatorname{Hess} f)(0)$ is semidefinite and that $f$ has an isolated minimum on each line through 0.

c) Nevertheless, $f$ does not have a local extremum at 0 (to be shown!).

Exercise 3.29 Given the functions $\Phi(x, y) = y^2 x - x^3$, $f(x, y) = x^2 + y^2 - 1$.

a) Examine $\Phi$ for extrema.

b) Sketch all contour lines $h = 0$ of $\Phi$.

c) Examine $\Phi$ for local extrema under the constraint $f(x, y) = 0$.

Exercise 3.30 The function

$$f(x, y) = \frac{\sin(2x^2 + 3y^2)}{x^2 + y^2}$$

has at $(0, 0)$ a discontinuity. This can be remedied easily by defining e.g. $f(0, 0) := 3$.

a) Show that $f$ is continuous on all $\mathbb{R}^2$ except at $(0, 0)$. Is it possible to define the function at the origin so that it is continuous?

b) Calculate all local extrema of the function $f$ and draw (sketch) a contour line image (not easy).

c) Determine the local extrema under the constraint (not easy):

i) $x = 0.1$

ii) $y = 0.1$

iii) $x^2 + y^2 = 4$

Exercise 3.31 Show that $\operatorname{grad}(fg) = g \operatorname{grad} f + f \operatorname{grad} g$.




Chapter 4 

Statistics and Probability Basics 



Based on samples, statistics deals with the derivation of general statements on certain features.¹



4.1 Recording Measurements in Samples 

Discrete feature: finite number of values.

Continuous feature: values in an interval of real numbers.

Definition 4.1 Let $X$ be a feature (or random variable). A series of measurements $x_1, \ldots, x_n$ for $X$ is called a sample of the length $n$.

Example 4.1 For the feature $X$ (grades of the exam Mathematics I in WS 97/98) the following sample has been recorded:

1.0 1.3 2.2 2.2 2.2 2.5 2.9 2.9 2.9 2.9 2.9 2.9 2.9 3.0 3.0 3.0 3.3 3.3 3.4 3.7 3.9 3.9 4.1 4.7

Let $g(x)$ be the absolute frequency of the value $x$. Then

$$h(x) = \frac{1}{n} g(x)$$

is called relative frequency or empirical density of $X$.



Grade x    Absolute frequency g(x)    Relative frequency h(x)
1.0        1                          0.042
1.3        1                          0.042
2.2        3                          0.13
2.5        1                          0.042
2.9        7                          0.29
3.0        3                          0.13
3.3        2                          0.083
3.4        1                          0.042
3.7        1                          0.042
3.9        2                          0.083
4.1        1                          0.042
4.7        1                          0.042



¹ The content of this chapter leans heavily on [?]. Therefore, [?] is the ideal book to read.

If $x_1 < x_2 < \ldots < x_n$, then

$$H(x) = \sum_{t \le x} h(t)$$

is the empirical distribution function.
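The frequencies of Example 4.1 and the empirical distribution function can be recomputed from the raw sample. This is a minimal sketch using only the Python standard library; the names `grades`, `g`, `h` and `H` mirror the notation of the text, everything else is an illustrative choice.

```python
from collections import Counter

grades = [1.0, 1.3, 2.2, 2.2, 2.2, 2.5, 2.9, 2.9, 2.9, 2.9, 2.9, 2.9, 2.9,
          3.0, 3.0, 3.0, 3.3, 3.3, 3.4, 3.7, 3.9, 3.9, 4.1, 4.7]

n = len(grades)
g = Counter(grades)                    # absolute frequencies g(x)
h = {x: g[x] / n for x in sorted(g)}   # relative frequencies h(x)


def H(x):
    """Empirical distribution function: sum of h(t) over all values t <= x."""
    return sum(freq for t, freq in h.items() if t <= x)


for x in h:
    print(f"{x:.1f}  {g[x]}  {h[x]:.3f}  {H(x):.3f}")
```

For example `H(2.0)` evaluates to about 0.083, the share of the students in this sample with a grade better than 2.0.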

It is apparent from the data that 8.3 % of the participating students in the exam Mathematics 1 in WS 97/98 had a grade better than 2.0.

In contrast, the following statement is an assumption: In the exam Mathematics 1, 8.3 % of the students of the HS RV-Wgt achieve a grade better than 2.0. This statement is a hypothesis and not provable.

However, under certain conditions one can determine the probability that this statement is 
true. Such computations are called statistical induction. 



[Figure: empirical density (left) and empirical distribution function (right)]




When calculating or plotting empirical density functions, it is often advantageous to group 
measured values to classes. 

Example 4.2 

The following frequency function has been determined from runtime measurements of a randomized program (automated theorem prover with randomized depth-first search and backtracking):

[Figure: histogram of the frequencies over the runtime $t_i$, axis from 0 to 60000]

In this graphic, at any value $t_i \in \{1, \ldots, 60000\}$ a frequency in the form of a histogram is shown. One can clearly see the scattering effects due to low frequencies per time value $t_i$. In the next image, 70 values each have been summarized to a class, which results in 600 classes overall.

[Figure: histogram of the frequencies over the runtime $t_i$ with classes of 70 values, axis from 0 to 60000]

Summarizing 700 values each to a class one obtains 86 classes as shown in the third image. Here, the structure of the frequency distribution is not recognizable anymore.

[Figure: frequencies $n_i$ of the sequential runtimes $t_i$ with classes of 700 values]

The number $\ell$ of classes should be chosen neither too high nor too low. In [?] a rule of thumb $\ell \le \sqrt{n}$ is given.



4.2 Statistical Parameters 

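The effort to describe a sample by a single number is fulfilled by the following definition:

Definition 4.2 For a sample $x_1, x_2, \ldots, x_n$ the term

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

is called arithmetic mean and if $x_1 < x_2 < \ldots < x_n$, then the sample median is defined as

$$\tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & \text{if } n \text{ odd} \\ \frac{1}{2}\left( x_{\frac{n}{2}} + x_{\frac{n}{2}+1} \right) & \text{if } n \text{ even} \end{cases}$$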



In the example 4.2, the arithmetic mean is marked with the symbol A. It is interesting that the arithmetic mean minimizes the sum of squares of the distances

$$\sum_{i=1}^{n} (x_i - x)^2$$

whereas the median minimizes the sum of the absolute values of the distances

$$\sum_{i=1}^{n} |x_i - x|$$

(proof as exercise). Often, one does not only want to determine a mean value, but also a measure for the mean deviation from the arithmetic mean.

Definition 4.3 The number

$$s_x^2 := \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2$$

is called sample variance and

$$s_x := \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

is called standard deviation.

4.3 Multidimensional Samples

If not only grades from Mathematics 1, but for any student also the grades of Mathematics 2 and further courses are considered, one can ask if there is a statistical relationship between the grades of different courses. Therefore, a simple tool, the covariance matrix, is introduced. For a multidimensional variable $(X_1, X_2, \ldots, X_k)$, a $k$-dimensional sample of the length $n$ consists of a list of vectors

$$(x_{11}, x_{21}, \ldots, x_{k1}),\ (x_{12}, x_{22}, \ldots, x_{k2}),\ \ldots,\ (x_{1n}, x_{2n}, \ldots, x_{kn})$$

By extension of example 4.1, we obtain an example for 2 dimensions.

Example 4.3

[Figure: two scatter plots of two-dimensional samples]



For the equally distributed random numbers in the right plot $\sigma_{xy} = 0.0025$ is determined. Thus, the two variables have a very low correlation.

If there are $k > 2$ variables, the data cannot easily be plotted graphically. But one can determine the covariances between two variables each in order to represent them in a covariance matrix $\sigma$:

$$\sigma_{ij} = \frac{1}{n - 1} \sum_{t=1}^{n} (x_{it} - \bar{x}_i)(x_{jt} - \bar{x}_j)$$

If dependencies among different variables are to be compared, a correlation matrix can be determined:

$$K_{ij} = \frac{\sigma_{ij}}{s_i \cdot s_j}$$

Here, all diagonal elements have the value 1.
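The covariance and correlation matrices can be sketched in plain Python. The normalisation by $n - 1$ is an assumption of this sketch; any other factor cancels in the correlation matrix as long as variances and covariances use the same one. The function names and the small data set are illustrative, not taken from the lecture data.

```python
import math


def covariance_matrix(variables):
    """variables: list of k lists, each holding the n measurements of one variable."""
    k, n = len(variables), len(variables[0])
    means = [sum(v) / n for v in variables]
    return [[sum((variables[i][t] - means[i]) * (variables[j][t] - means[j])
                 for t in range(n)) / (n - 1)   # assumed normalisation
             for j in range(k)]
            for i in range(k)]


def correlation_matrix(variables):
    """K_ij = sigma_ij / (s_i * s_j), with s_i the standard deviation of variable i."""
    cov = covariance_matrix(variables)
    s = [math.sqrt(cov[i][i]) for i in range(len(cov))]
    return [[cov[i][j] / (s[i] * s[j]) for j in range(len(cov))]
            for i in range(len(cov))]


x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]   # perfectly correlated with x
z = [4.0, 3.0, 2.0, 1.0]   # perfectly anti-correlated with x

for row in correlation_matrix([x, y, z]):
    print([round(value, 3) for value in row])
```

In this toy data set all correlations are $+1$ or $-1$, and the diagonal is 1 as stated in the text.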

Example 4.4 In a medical database of 473 patients² with a surgical removal of their appendix, 15 different symptoms as well as the diagnosis (appendicitis negative/positive) have been recorded.

² The data was obtained from the hospital 14 Nothelfer in Weingarten with the friendly assistance of Dr. Rampf. Mr. Kuchelmeister used the data for the development of an expert system in his diploma thesis.

age:                                  continuous.
gender_(1=m 2=f):                     1,2.
pain_quadrant1_(0=no 1=yes):          0,1.
pain_quadrant2_(0=no 1=yes):          0,1.
pain_quadrant3_(0=no 1=yes):          0,1.
pain_quadrant4_(0=no 1=yes):          0,1.
guarding_(0=no 1=yes):                0,1.
rebound_tenderness_(0=no 1=yes):      0,1.
pain_on_tapping_(0=no 1=yes):         0,1.
vibration_(0=no 1=yes):               0,1.
rectal_pain_(0=no 1=yes):             0,1.
temp_ax:                              continuous.
temp_re:                              continuous.
leukocytes:                           continuous.
diabetes_mellitus_(0=no 1=yes):       0,1
appendicitis_(0=no 1=yes):            0,1



The first 3 data sets are as follows:

26 1 0 0 1 0 1 0 1 1 0 37.9 38.8 23100
17 2 0 0 1 0 1 0 1 1 0 36.9 37.4 8100
28 1 0 0 1 0 0 0 0 0 0 36.7 36.9 9600


The correlation matrix for the data of all 473 patients is:

 1.      -0.009    0.14     0.037   -0.096    0.12     0.018    0.051   -0.034   -0.041    0.034    0.037    0.05    -0.037    0.37      0.012
-0.009    1.      -0.0074  -0.019   -0.06     0.063   -0.17     0.0084  -0.17    -0.14    -0.13    -0.017   -0.034   -0.14     0.045    -0.2
 0.14    -0.0074   1.       0.55    -0.091    0.24     0.13     0.24     0.045    0.18     0.028    0.02     0.045    0.03     0.11      0.045
 0.037   -0.019    0.55     1.      -0.24     0.33     0.051    0.25     0.074    0.19     0.087    0.11     0.12     0.11     0.14     -0.0091
-0.096   -0.06    -0.091   -0.24     1.       0.059    0.14     0.034    0.14     0.049    0.057    0.064    0.058    0.11     0.017     0.14
 0.12     0.063    0.24     0.33     0.059    1.       0.071    0.19     0.086    0.15     0.048    0.11     0.12     0.063    0.21      0.053
 0.018   -0.17     0.13     0.051    0.14     0.071    1.       0.16     0.4      0.28     0.2      0.24     0.36     0.29    -0.00013   0.33
 0.051    0.0084   0.24     0.25     0.034    0.19     0.16     1.       0.17     0.23     0.24     0.19     0.24     0.27     0.083     0.084
-0.034   -0.17     0.045    0.074    0.14     0.086    0.4      0.17     1.       0.53     0.25     0.19     0.27     0.27     0.026     0.38
-0.041   -0.14     0.18     0.19     0.049    0.15     0.28     0.23     0.53     1.       0.24     0.15     0.19     0.23     0.02      0.32
 0.034   -0.13     0.028    0.087    0.057    0.048    0.2      0.24     0.25     0.24     1.       0.17     0.17     0.22     0.098     0.17
 0.037   -0.017    0.02     0.11     0.064    0.11     0.24     0.19     0.19     0.15     0.17     1.       0.72     0.26     0.035     0.15
 0.05    -0.034    0.045    0.12     0.058    0.12     0.36     0.24     0.27     0.19     0.17     0.72     1.       0.38     0.044     0.21
-0.037   -0.14     0.03     0.11     0.11     0.063    0.29     0.27     0.27     0.23     0.22     0.26     0.38     1.       0.051     0.44
 0.37     0.045    0.11     0.14     0.017    0.21    -0.00013  0.083    0.026    0.02     0.098    0.035    0.044    0.051    1.       -0.0055
 0.012   -0.2      0.045   -0.0091   0.14     0.053    0.33     0.084    0.38     0.32     0.17     0.15     0.21     0.44    -0.0055    1.



The matrix structure is more apparent if the numbers are illustrated as a density plot (see footnote 3). In
the left diagram, bright stands for positive and dark for negative values. The right plot shows the
absolute values. Here, white stands for a strong correlation between two variables and black
for no correlation.

3 The first two images have been rotated by 90°. Therefore, the fields in the density plot correspond to the
matrix elements.

It is clearly apparent that most of the variable pairs have no or only a very low correlation,
whereas the two temperature variables are highly correlated.



4.4 Probability Theory 

The purpose of probability theory is to determine the probability of certain possible events 
within an experiment. 

Example 4.5 When throwing a die once, the probability for the event "throwing a six" is
1/6, whereas the probability for the event "throwing an odd number" is 1/2.



Definition 4.4 Let $\Omega$ be the set of possible outcomes of an experiment. Each $\omega \in \Omega$
stands for a possible outcome of the experiment. If the $\omega_i \in \Omega$ exclude each other, but
cover all possible outcomes, they are called elementary events.



Example 4.6 When throwing a die once, $\Omega = \{1, 2, 3, 4, 5, 6\}$, because no two of these events
can occur at the same time. Throwing an even number $\{2, 4, 6\}$ is not an elementary event, and
neither is throwing a number lower than 5 $\{1, 2, 3, 4\}$, because $\{2, 4, 6\} \cap \{1, 2, 3, 4\} = \{2, 4\} \neq \emptyset$.



Definition 4.5 Let $\Omega$ be a set of elementary events. $\bar{A} = \Omega - A = \{\omega \in \Omega \mid \omega \notin A\}$ is
called the complementary event to $A$. A subset $\mathcal{A}$ of $2^\Omega$ is called event algebra over
$\Omega$ if:

1. $\Omega \in \mathcal{A}$

2. With $A$, $\bar{A}$ is also in $\mathcal{A}$.

3. If $(A_n)_{n \in \mathbb{N}}$ is a sequence in $\mathcal{A}$, then $\bigcup_{n=1}^{\infty} A_n$ is also in $\mathcal{A}$.



Every event algebra contains the sure event $\Omega$ as well as the impossible event $\emptyset$.

When throwing a die, one could choose $\mathcal{A} = 2^\Omega$ and $\Omega = \{1, 2, 3, 4, 5, 6\}$. Thus $\mathcal{A}$ contains any possible
event of a throw.

If one is only interested in throwing a six, one would consider $A = \{6\}$ and $\bar{A} = \{1, 2, 3, 4, 5\}$
only, where the algebra results in $\mathcal{A} = \{\emptyset, A, \bar{A}, \Omega\}$.

The notion of probability should give us a description, as objective as possible, of our
"belief" or "conviction" about the outcome of an experiment. As numeric values, all real
numbers in the interval [0, 1] shall be possible, whereby 0 is the probability for the impossible
event and 1 the probability for the sure event.



4.4.1 The Classical Probability Definition 

Let $\Omega = \{\omega_1, \omega_2, \ldots, \omega_n\}$ be finite. No elementary event is preferred, that is, we assume
a symmetry regarding the frequency of occurrence of all elementary events. The probability
$P(A)$ of the event $A$ is defined by

$$P(A) = \frac{|A|}{|\Omega|} = \frac{\text{number of outcomes favourable to } A}{\text{number of possible outcomes}}$$

It is obvious that any elementary event has the probability $1/n$. The assumption of the same
probability for all elementary events is called the Laplace assumption.



Example 4.7 Throwing a die, the probability for an even number is

$$P(\{2, 4, 6\}) = \frac{|\{2, 4, 6\}|}{|\{1, 2, 3, 4, 5, 6\}|} = \frac{3}{6} = \frac{1}{2}.$$


4.4.2 The Axiomatic Probability Definition 

The classical definition is suitable for a finite set of elementary events only. For infinite sets
a more general definition is required.



Definition 4.6 Let $\Omega$ be a set and $\mathcal{A}$ an event algebra on $\Omega$. A mapping

$$P : \mathcal{A} \to [0, 1]$$

is called probability measure if:

1. $P(\Omega) = 1$.

2. If the events $A_n$ of the sequence $(A_n)_{n \in \mathbb{N}}$ are pairwise inconsistent, i.e. for $i, j \in \mathbb{N}$ with $i \neq j$
it holds $A_i \cap A_j = \emptyset$, then

$$P\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} P(A_i).$$

For $A \in \mathcal{A}$, $P(A)$ is called probability of the event $A$.



From this definition, some rules follow directly: 





Theorem 4.1

1. $P(\emptyset) = 0$, i.e. the impossible event has the probability 0.

2. For pairwise inconsistent events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B)$.

3. For a finite number of pairwise inconsistent events $A_1, A_2, \ldots, A_k$ it holds

$$P\left(\bigcup_{i=1}^{k} A_i\right) = \sum_{i=1}^{k} P(A_i).$$

4. For two complementary events $A$ and $\bar{A}$ it holds $P(A) + P(\bar{A}) = 1$.

5. For any events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

6. For $A \subseteq B$ it holds $P(A) \le P(B)$.



Proof: as exercise. 

4.4.3 Conditional Probabilities 

Example 4.8 In the Doggenriedstraße in Weingarten the speed of 100 vehicles is measured.
At each measurement it is recorded if the driver was a student or not. The results are as
follows:

Event                                                  Frequency   Relative frequency
Vehicle observed                                       100         1
Driver is a student ($S$)                              30          0.3
Speed too high ($G$)                                   10          0.1
Driver is a student and speeding ($S \cap G$)          5           0.05

We now ask the following question: Do students speed more frequently than the average
person, or than non-students? (See footnote 4.) The answer is given by the probability $P(G|S)$ for speeding
under the condition that the driver is a student.

$$P(G|S) = \frac{|\text{Driver is a student and speeding}|}{|\text{Driver is a student}|} = \frac{5}{30} = \frac{1}{6}$$


Definition 4.7 For two events $A$ and $B$, the probability for $A$ under the condition $B$
(conditional probability) is defined by

$$P(A|B) = \frac{P(A \cap B)}{P(B)}.$$

4 The determined probabilities can only be used for further statements if the sample (100 vehicles) is
representative. Otherwise, one can only make a statement about the observed 100 vehicles.

In Example 4.8 one can recognize that in the case of a finite event set the conditional
probability $P(A|B)$ can be treated as the probability of $A$ when regarding only the event
$B$, i.e. as

$$P(A|B) = \frac{|A \cap B|}{|B|}.$$

Definition 4.8 If two events $A$ and $B$ satisfy

$$P(A|B) = P(A),$$

then these events are called independent.



$A$ and $B$ are independent if the probability of the event $A$ is not influenced by the event $B$.



Theorem 4.2 From this definition, for independent events $A$ and $B$ it follows that

$$P(A \cap B) = P(A) \cdot P(B)$$

Proof:

$$P(A|B) = \frac{P(A \cap B)}{P(B)} = P(A) \quad\Rightarrow\quad P(A \cap B) = P(A) \cdot P(B)$$

Example 4.9 The probability for throwing two sixes with two dice is 1/36 if the dice are
independent, because

$$P(\text{die 1} = \text{six}) \cdot P(\text{die 2} = \text{six}) = \frac{1}{6} \cdot \frac{1}{6} = \frac{1}{36} = P(\text{die 1} = \text{six} \cap \text{die 2} = \text{six}),$$

whereby the last equation applies only if the two dice are independent. If for example by
magic power die 2 always falls like die 1, it holds

$$P(\text{die 1} = \text{six} \cap \text{die 2} = \text{six}) = \frac{1}{6}.$$



4.4.4 The Bayes Formula 



Since $P(A \cap B)$ in Definition 4.7 is symmetric in $A$ and $B$, one can write

$$P(A|B) = \frac{P(A \cap B)}{P(B)} \quad\text{as well as}\quad P(B|A) = \frac{P(A \cap B)}{P(A)}.$$

Solving both equations for $P(A \cap B)$ and equating results in the Bayes formula

$$P(A|B) = \frac{P(B|A) \cdot P(A)}{P(B)}.$$

A very reliable alarm system warns of a burglary with a certainty of 99%. So, can we infer
a burglary from an alarm with high certainty?

No, because if for example $P(A|B) = 0.99$, $P(A) = 0.1$, $P(B) = 0.001$ holds, then the
Bayes formula returns:

$$P(B|A) = \frac{P(A|B) \, P(B)}{P(A)} = \frac{0.99 \cdot 0.001}{0.1} \approx 0.01.$$
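The alarm example can be checked numerically. The following is a minimal sketch in Python; the function name `bayes_posterior` and the reading of $A$ as "alarm" and $B$ as "burglary" are assumptions made for illustration only.

```python
def bayes_posterior(p_a_given_b, p_b, p_a):
    """Return P(B|A) = P(A|B) * P(B) / P(A) (Bayes formula)."""
    if p_a <= 0:
        raise ValueError("P(A) must be positive")
    return p_a_given_b * p_b / p_a

# Assumed reading of the example: A = alarm, B = burglary.
p_burglary_given_alarm = bayes_posterior(p_a_given_b=0.99, p_b=0.001, p_a=0.1)
print(p_burglary_given_alarm)
```

Although the alarm is very reliable, the small prior probability of a burglary keeps the posterior probability at about 1%.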
4.5 Discrete Distributions



Definition 4.9 A random variable whose range of values is finite or countably infinite is
called a discrete random variable.



Example 4.10 Throwing a die, the number $X$ is a discrete random variable with the values
$\{1, 2, 3, 4, 5, 6\}$, that is, in the example it holds $x_1 = 1, \ldots, x_6 = 6$. If the die does not
prefer any number, then

$$p_i = P(X = x_i) = 1/6,$$

that is, the numbers are uniformly distributed. The probability to throw a number $\le 5$
is

$$P(X \le 5) = \sum_{i:\, x_i \le 5} p_i = 5/6.$$



In general, one defines



Definition 4.10 The function which assigns a probability $p_i$ to each value $x_i$ of the random
variable $X$ is called the discrete density function of $X$.



Definition 4.11 For any real number $x$, the function

$$x \mapsto P(X \le x) = \sum_{i:\, x_i \le x} p_i$$

is called the distribution function of $X$.

Like the empirical distribution function, $P(X \le x)$ is a monotonically increasing step
function. The following definitions are analogous to the mean value and variance of samples.

Definition 4.12 The number

$$E(X) = \sum_i x_i p_i$$

is called expected value. The variance is given by

$$\mathrm{Var}(X) := E((X - E(X))^2) = \sum_i (x_i - E(X))^2 p_i$$

whereby $\sqrt{\mathrm{Var}(X)}$ is called standard deviation.



It is easy to see that $\mathrm{Var}(X) = E(X^2) - E(X)^2$ (exercise).
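As a small illustration of Definition 4.12, the expected value and variance of a fair die can be computed exactly with rational arithmetic. This is only a sketch; the helper names `expected_value` and `variance` are chosen here and do not come from the lecture.

```python
from fractions import Fraction

def expected_value(values, probs):
    """E(X) = sum of x_i * p_i over all values of the discrete variable."""
    return sum(x * p for x, p in zip(values, probs))

def variance(values, probs):
    """Var(X) = sum of (x_i - E(X))^2 * p_i."""
    mu = expected_value(values, probs)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probs))

values = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6          # fair die: uniform distribution

mean = expected_value(values, probs)
var = variance(values, probs)
second_moment = expected_value([x * x for x in values], probs)
print(mean, var, second_moment - mean ** 2)
```

The last printed value evaluates $E(X^2) - E(X)^2$ and agrees with the variance for this example, which illustrates (but does not prove) the identity left as an exercise.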
4.5.1 Binomial Distribution

Let a player's scoring probability at penalty kicking be $p = 0.9$. The probability to
score at every one of 10 independent kicks is

$$B_{10,0.9}(10) = 0.9^{10} \approx 0.35.$$

It is very unlikely that the player scores only once; the probability is

$$B_{10,0.9}(1) = 10 \cdot 0.1^9 \cdot 0.9 = 0.000000009$$

We might ask which number of scores is the most frequent at 10 kicks.



Definition 4.13 The distribution with the density function

$$B_{n,p}(x) = \binom{n}{x} p^x (1 - p)^{n - x}$$

is called binomial distribution.



Thus, the binomial distribution indicates the probability that with $n$ independent tries of a
binary event of probability $p$ the result will be positive $x$ times. Therefore, we obtain

$$B_{10,0.9}(k) = \binom{10}{k} \, 0.9^k \cdot 0.1^{10-k}$$

The following histograms show the densities for our example for $p = 0.9$ as well as for $p = 0.5$.



For the binomial distribution it holds

$$E(X) = \sum_{x=0}^{n} x \binom{n}{x} p^x (1 - p)^{n - x} = np$$

and

$$\mathrm{Var}(X) = np(1 - p).$$
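The question of the most frequent number of scores can be answered by evaluating the density for all $k$. The sketch below assumes the standard binomial density $\binom{n}{x} p^x (1-p)^{n-x}$; the function name `binomial_pmf` is illustrative.

```python
from math import comb

def binomial_pmf(n, p, x):
    """Probability of exactly x successes in n independent tries."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 10, 0.9
pmf = [binomial_pmf(n, p, k) for k in range(n + 1)]

# Number of scores with the highest probability.
most_likely = max(range(n + 1), key=lambda k: pmf[k])
# Expected value computed directly from the definition.
mean = sum(k * pmf[k] for k in range(n + 1))
print(most_likely, mean)
```

For the penalty example the most likely number of scores is 9, and the directly computed expected value matches $np = 9$ up to rounding.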




4.5.2 Hypergeometric Distribution

Let $N$ small balls be placed in a box. $K$ of them are black and $N - K$ white. When drawing
$n$ balls, the probability to draw $x$ black ones is

$$H_{N,K,n}(x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}.$$

The left of the following graphs shows $H_{100,30,10}(x)$, the right one $H_{N,0.3N,10}(x)$. This corresponds
to $N$ balls in the box and 30% black balls. It is apparent that for $N = 10$ the
density has a sharp maximum, which becomes flatter with $N > 10$.

[Figure: densities $H_{100,30,10}(x)$ (left) and $H_{N,0.3N,10}(x)$ (right)]

As expected, the expected value of the hypergeometric distribution is

$$E(X) = n \cdot \frac{K}{N}.$$
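The expected value can be checked for the parameters of the left graph. This sketch assumes the usual hypergeometric density $\binom{K}{x}\binom{N-K}{n-x} / \binom{N}{n}$ and uses exact fractions; the function name is illustrative.

```python
from fractions import Fraction
from math import comb

def hypergeometric_pmf(N, K, n, x):
    """Probability of drawing x black balls when drawing n of N balls, K of them black."""
    return Fraction(comb(K, x) * comb(N - K, n - x), comb(N, n))

N, K, n = 100, 30, 10
pmf = [hypergeometric_pmf(N, K, n, x) for x in range(n + 1)]
mean = sum(x * p for x, p in enumerate(pmf))
print(mean, Fraction(n * K, N))
```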



4.6 Continuous Distributions 



Definition 4.14 A random variable $X$ is called continuous if its value range is a subset
of the real numbers and if for the density function $f$ and the distribution function $F$ it
holds

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt.$$

With the requirements $P(\Omega) = 1$ and $P(\emptyset) = 0$ (see Definition 4.6) we obtain

$$\lim_{x \to -\infty} F(x) = 0 \quad\text{and}\quad \lim_{x \to \infty} F(x) = 1.$$



4.6.1 Normal Distribution 

The most important continuous distribution for real applications is the normal distribution
with the density

$$\varphi_{\mu,\sigma}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x - \mu)^2}{2\sigma^2}\right).$$

Theorem 4.3 For a normally distributed variable $X$ with the density $\varphi_{\mu,\sigma}$ it holds $E(X) = \mu$
and $\mathrm{Var}(X) = \sigma^2$.



For $\mu = 0$ and $\sigma = 1$ one obtains the standard normal distribution $\varphi_{0,1}$. With $\sigma = 2$ one
obtains the flatter and broader density $\varphi_{0,2}$.




Example 4.11 Let the waiting times at a traffic light on a country road at low traffic be
uniformly distributed. We now want to estimate the mean waiting time by measuring the
waiting time $T$ 200 times.

The empirical frequency of the waiting times is shown in the figure. The mean value (•)
lies at 60.165 seconds. The frequencies and the mean value indicate a uniform distribution of
times between 0 and 120 sec.

[Figure: Frequencies of the waiting times (40 classes)]

Due to the finiteness of the sample, the mean value does not lie exactly at the expected
value of 60 seconds. We now might ask whether the mean value is reliable, or more
precisely, with what probability such a measured mean differs from the expected value by
a certain deviation. To investigate this, we regard the mean value of 200 measurements as a
random variable and record a sample of this mean value. For example, we let 200 people
independently measure the mean value from 200 records of the waiting time at a traffic light.
We obtain the following result:

The empirical density function of the distribution of the mean value $\bar{t}$ shows a clear maximum
at $\bar{t} = 60$ seconds and falls off steeply towards the borders at 0 and 120 seconds. It looks like a
normal distribution.

[Figure: empirical density of the mean value $\bar{t}$; horizontal axis from 52.5 to 67.5 seconds, vertical axis from 0 to 0.15]

The kind of relation between the distribution of the mean value and the normal distribution 
is shown by the following theorem: 




Theorem 4.4 (Central Limit Theorem) If $X_1, X_2, \ldots, X_n$ are independent identically
distributed random variables with $\sigma(X_i) < \infty$ and

$$S_n = X_1 + \ldots + X_n,$$

then $S_n$ tends (for $n \to \infty$) to a normal distribution with the expected value $nE(X_1)$ and
the standard deviation $\sqrt{n}\,\sigma(X_1)$. It holds

$$\lim_{n \to \infty} \sup\left\{ \left| S_n(x) - \varphi_{nE(X_1),\sqrt{n}\sigma(X_1)}(x) \right| : x \in \mathbb{R} \right\} = 0.$$

This theorem has some important consequences:

• The sum of independent identically distributed random variables asymptotically tends
to a normal distribution.

• The mean of $n$ independent measurements of a random variable is approximately
normally distributed. The approximation improves as more measurements are
made.

• The standard deviation of a sum $X_1 + \ldots + X_n$ of independent identically distributed random
variables is equal to $\sqrt{n}\,\sigma(X_1)$.
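The second consequence can be tried out with a small simulation of the traffic light experiment. The sketch below is an illustration only: the seed, the number of repetitions (2000) and the helper name are assumptions, and it uses the known standard deviation $(b-a)/\sqrt{12}$ of a uniform distribution on $(a, b)$.

```python
import random
import statistics

def sample_mean_of_waiting_times(n, rng, low=0.0, high=120.0):
    """Mean of n waiting times drawn uniformly from [low, high]."""
    return sum(rng.uniform(low, high) for _ in range(n)) / n

rng = random.Random(0)   # fixed seed so that the run is reproducible (assumption)
n = 200                  # measurements per mean, as in Example 4.11
means = [sample_mean_of_waiting_times(n, rng) for _ in range(2000)]

# Standard deviation of the mean predicted by the theorem: sigma / sqrt(n).
predicted_std = (120.0 / 12 ** 0.5) / n ** 0.5
observed_mean = statistics.fmean(means)
observed_std = statistics.stdev(means)
print(observed_mean, observed_std, predicted_std)
```

The observed mean and standard deviation of the simulated means should lie close to 60 and $\sqrt{6} \approx 2.45$.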

Example 4.12

The following diagram shows the (exact) distribution of the mean calculated from $n$ i.i.d.
(independent identically distributed) discrete variables, each uniformly distributed: $p(0) =
p(1) = p(2) = p(3) = p(4) = 1/5$.

[Figure: Distribution of mean of uniform i.i.d. variables]

With the help of the central limit theorem we now want to determine the normal distribution
of the mean value from Example 4.11 in order to compare it with the empirical density of
the mean value. The mean value $\bar{t}_n$ after $n$ time measurements is

$$\bar{t}_n = \frac{1}{n} \sum_{i=1}^{n} T_i.$$

Following Theorem 4.4, the sum $\sum_{i=1}^{n} T_i$ is approximately normally distributed and has the density

$$\varphi_{nE(T),\sqrt{n}\sigma}(x) = \frac{1}{\sqrt{2\pi n}\,\sigma} \exp\left(-\frac{(x - nE(T))^2}{2n\sigma^2}\right).$$

The mean value $\bar{t}_n$ has the density $\varphi_{E(T),\sigma/\sqrt{n}}$ (see footnote 5). The variance $\sigma^2$ of the uniform distribution
is still missing.

5 This is given by the following, easy to prove property of the variance: $\mathrm{Var}(X/n) = \frac{1}{n^2} \mathrm{Var}(X)$.



Definition 4.15 The density of the uniform distribution over the interval $(a, b)$
(also called rectangular distribution) is

$$f(x) = \begin{cases} \frac{1}{b-a} & \text{if } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$$

One calculates

$$E(X) = \frac{1}{b-a} \int_a^b x\,dx = \frac{a+b}{2} \qquad (4.1)$$

$$\mathrm{Var}(X) = E(X^2) - E(X)^2 = \frac{(b-a)^2}{12} \qquad (4.2)$$

Therefore, for the example one calculates

$$\frac{\sigma}{\sqrt{n}} = \frac{b-a}{\sqrt{12n}} = \frac{120}{\sqrt{12 \cdot 200}} = \sqrt{6}.$$

Thus, the density of the mean value of the traffic light waiting times should be approximated
well by $\varphi_{60,\sqrt{6}}$, as can be seen in the following image.

[Figure: Density function of the distribution of the mean value with the density of the normal distribution $\varphi_{60,\sqrt{6}}$]




Since we now know the density of the mean value, it is easy to specify a symmetric interval
in which the mean value (after our 200 measurements) lies with a probability of 0.95. In the
image above ($\varphi_{60,\sqrt{6}}$) we have to determine the two points $u_1$ and $u_2$ which satisfy

$$P(u_1 < \bar{t} < u_2) = \int_{u_1}^{u_2} \varphi_{60,\sqrt{6}}(t)\,dt = 0.95.$$

Because of

$$\int_{-\infty}^{\infty} \varphi_{60,\sqrt{6}}(t)\,dt = 1$$

and the symmetry of the density, it must hold

$$\int_{-\infty}^{u_1} \varphi_{60,\sqrt{6}}(t)\,dt = 0.025 \quad\text{and}\quad \int_{-\infty}^{u_2} \varphi_{60,\sqrt{6}}(t)\,dt = 0.975.$$

Graphically, we can find the two points $u_1, u_2$ by searching for the $x$ values at the levels 0.025
and 0.975 in the graph of the distribution function of the normal distribution

$$\Phi_{60,\sqrt{6}}(x) = P(X \le x) = \int_{-\infty}^{x} \varphi_{60,\sqrt{6}}(t)\,dt.$$

From the figure we read off

$$u_1 \approx 55.2, \quad u_2 \approx 64.8.$$

[Figure: distribution function $\Phi_{60,\sqrt{6}}$ with $u_1$ and $u_2$ marked on the horizontal axis (56 to 66)]

We now know the following: After our sample of 200 time measurements the expected value
of our waiting time $t$ lies in the interval [55.2, 64.8] with a probability of 0.95 (see footnote 6). This interval
is called the confidence interval to the level 0.95.
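Instead of reading the two points off a graph, they can be computed from the inverse distribution function. The following sketch uses `statistics.NormalDist` from the Python standard library; the variable names are chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Normal approximation of the mean of 200 waiting times: mu = 60, sigma = sqrt(6).
mean_dist = NormalDist(mu=60.0, sigma=sqrt(6.0))

alpha = 0.05
u1 = mean_dist.inv_cdf(alpha / 2)        # level 0.025
u2 = mean_dist.inv_cdf(1 - alpha / 2)    # level 0.975
coverage = mean_dist.cdf(u2) - mean_dist.cdf(u1)
print(round(u1, 1), round(u2, 1), coverage)
```

The computed bounds agree with the values 55.2 and 64.8 read from the graph.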

In general, the confidence interval $[u_1, u_2]$ to the level $1 - \alpha$ has the following meaning.
Instead of estimating a parameter $\theta$ from sample measurements, we can try to determine
an interval that contains the value of $\theta$ with high probability. For a given number $\alpha$ (in the
example above, $\alpha$ was 0.05) two numbers $u_1$ and $u_2$ are sought which satisfy

$$P(u_1 \le \theta \le u_2) = 1 - \alpha.$$

The quantiles of a distribution are not to be confused with the confidence interval.



Definition 4.16 Let $X$ be a continuous random variable and $\gamma \in (0, 1)$. A value $x_\gamma$ is
called $\gamma$-quantile if it holds

$$P(X \le x_\gamma) = \int_{-\infty}^{x_\gamma} f(t)\,dt = \gamma.$$

The 0.5 quantile is called median.



4.7 Exercises 

6 This result is only exact under the condition that the standard deviation $\sigma$ of the distribution of $t$ is
known. If $\sigma$ is unknown too, the calculation is more complex.

Exercise 4.1

a) Show that the arithmetic mean $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$ minimizes the sum of the squared distances
$\sum_{i=1}^{n} (x_i - x)^2$.

b) Show that the median

$$\tilde{x} = \begin{cases} x_{\frac{n+1}{2}} & \text{if } n \text{ odd} \\ \frac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right) & \text{if } n \text{ even} \end{cases}$$

minimizes the sum of the absolute values of the distances $\sum_{i=1}^{n} |x_i - x|$. (Hint: consider by
an example how $\sum_{i=1}^{n} |x_i - x|$ is going to change if $x$ deviates from the median.)

Exercise 4.2 As thrifty, hard-working Swabians we want to try to calculate whether the 
German lottery is worth playing. In German lottery, 6 balls are drawn out of 49. The 49 
balls are numbered from 1-49. A drawn ball is not put back into the pot. In each lottery 
ticket held, the player chooses 6 numbers out of 49. 

a) Calculate the number of possible draws in the lottery (6 of 49 / Saturday night lottery), 
which result in having (exactly) three correct numbers. Then, what is the probability 
to have three correct numbers? 

b) Give a formula for the probability of achieving n numbers in the lottery. 

c) Give a formula for the probability of achieving n numbers in the lottery with the bonus 
number (the bonus number is determined by an additionally drawn 7th ball). 

d) What is the probability that the (randomly) drawn "super number" (a number out of
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}) equals the last digit of the serial number of the lottery ticket?

e) Calculate the average lottery prize if the following sums are paid out (s.n.: super
number, b.n.: bonus number):

Winning class       I              II              III          IV         V       VI           VII
Correct numbers     6 with s.n.    6 without s.n.  5 with b.n.  5          4       3 with b.n.  3
Prize (6.12.1997)   4.334.833,80   1.444.944,60    135.463,50   10.478,20  178,20  108,70       11,00
Prize (29.11.1997)  12.085.335,80  1.382.226,80    172.778,30   12.905,90  192,30  82,30        12,10
Prize (22.11.1997)  7.938.655,30   3.291.767,70    141.075,70   11.018,40  157,50  79,20        10,10
Prize (15.11.1997)  3.988.534,00   2.215.852,20    117.309,80   9.537,30   130,70  60,80        8,70
Prize (8.11.1997)   16.141.472,80  7.288.193,60    242.939,70   14.798,30  190,10  87,70        10,90



Exercise 4.3 Show that for the variance the following rule holds

$$\mathrm{Var}(X) = E(X^2) - E(X)^2.$$



Exercise 4.4 Prove the following statements of Theorem 4.1:

a) For pairwise inconsistent events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B)$. (Hint:
consider how the second part of Definition 4.6 could be applied to (only) 2 events.)

b) $P(\emptyset) = 0$, i.e. the impossible event has the probability 0.

c) For two complementary events $A$ and $\bar{A}$ it holds $P(A) + P(\bar{A}) = 1$.

d) For arbitrary events $A$ and $B$ it holds $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.

e) For $A \subseteq B$ it holds $P(A) \le P(B)$.



Exercise 4.5 Give an example for an estimator with 0 variance. 
Exercise 4.6 Show that for the sample variance it holds: 

1 



s 2 = 



n 



S' 



1 

3 = 1 



n y 



n 



n 



ix-fi) . 




Chapter 5 

Numerical Mathematics 
Fundamentals 

5.1 Arithmetics on the Computer 

5.1.1 Floating Point Numbers 

The set of floating point numbers to base $\beta$, with $t$ fractional digits and exponents between
$m$ and $M$, can be formally defined by

$$F(\beta, t, m, M) = \{d : d = \pm .d_1 d_2 \ldots d_t \cdot \beta^e\} \cup \{0\} \subset \mathbb{Q}$$

with

$\beta \in \mathbb{N}$
$0 \le d_i \le \beta - 1$          $d_i$: digits, $d_1 \neq 0$
$d_1, d_2, \ldots, d_t$          : mantissa
$t$                              : mantissa length
$e$                              : exponent with $m \le e \le M$, $m, M \in \mathbb{Z}$

The floating point number $\pm .d_1 d_2 \ldots d_t \cdot \beta^e$ has the value

$$d = \pm\left(d_1 \beta^{e-1} + d_2 \beta^{e-2} + \cdots + d_t \beta^{e-t}\right)$$

Example 5.1 Let $\beta = 2$ and $t = 3$ be given, that is, we consider three-digit numbers in
the binary system. The number $0.101 \cdot 2^{21}$ has the value

$$0.101 \cdot 2^{21} = 1 \cdot 2^{20} + 0 \cdot 2^{19} + 1 \cdot 2^{18} = 2^{20} + 2^{18}.$$

In the decimal system with $\beta = 10$ we need a six-digit mantissa ($t = 6$) to represent this
number:

$$2^{20} + 2^{18} = 1310720 = 0.131072 \cdot 10^7.$$



5.1.1.1 Distribution of $F(\beta, t, m, M)$

$$|F(\beta, t, m, M)| = 2\,\underbrace{(M - m + 1)}_{\text{exponents}}\,\underbrace{(\beta^t - \beta^{t-1})}_{\text{mantissas}} + \underbrace{1}_{0}$$

Example 5.2 $F(2, 3, -1, 2)$
With the formula above we get:

$$|F(2, 3, -1, 2)| = 2 \cdot 4 \cdot (2^3 - 2^2) + 1 = 33$$

Thus there are only the "0" and 32 different numbers between

$\pm 0.100 \cdot 2^{-1}$, the number with smallest absolute value
$\pm 0.111 \cdot 2^{2}$, the number with largest absolute value

The elements $\ge 0$ of $F(2, 3, -1, 2)$ are

$$0, \tfrac{1}{4}, \tfrac{5}{16}, \tfrac{3}{8}, \tfrac{7}{16}, \tfrac{1}{2}, \tfrac{5}{8}, \tfrac{3}{4}, \tfrac{7}{8}, 1, \tfrac{5}{4}, \tfrac{3}{2}, \tfrac{7}{4}, 2, \tfrac{5}{2}, 3, \tfrac{7}{2}$$

Distribution on the number line:

[Figure: elements of $F(2, 3, -1, 2)$ on the number line with marks at 0, 1/4, 1/2, 1, 2, 3, 4; gap at zero]

problems: 

• Exponent overflow 

• Exponent underflow 

• Round-off error 
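The set $F(2, 3, -1, 2)$ from Example 5.2 is small enough to enumerate. The following sketch builds it from the definition with exact fractions; the function name `floating_point_set` is an assumption made for illustration.

```python
from fractions import Fraction
from itertools import product

def floating_point_set(beta, t, m, M):
    """Enumerate F(beta, t, m, M): normalized t-digit mantissas, exponents m..M, plus 0."""
    numbers = {Fraction(0)}
    for e in range(m, M + 1):
        for digits in product(range(beta), repeat=t):
            if digits[0] == 0:          # normalization: first digit d_1 must not be 0
                continue
            # value = d_1 * beta^(e-1) + d_2 * beta^(e-2) + ... + d_t * beta^(e-t)
            value = sum(Fraction(d) * Fraction(beta) ** (e - i)
                        for i, d in enumerate(digits, start=1))
            numbers.update((value, -value))
    return sorted(numbers)

F = floating_point_set(2, 3, -1, 2)
nonnegative = [x for x in F if x >= 0]
print(len(F), [str(x) for x in nonnegative])
```

The enumeration makes the gap at zero and the growing spacing between neighbouring numbers directly visible.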

5.1.2 Round-off Errors 

5.1.2.1 Round-off and Truncation Errors (absolute)

Definition 5.1 $\mathrm{fl}_c, \mathrm{fl}_r : [-0.\alpha \ldots \alpha \cdot \beta^M,\ 0.\alpha \ldots \alpha \cdot \beta^M] \to F(\beta, t, m, M)$ with $\alpha = \beta - 1$

Round-off: $x \mapsto \mathrm{fl}_r(x)$ = nearest neighbor of $x$ in $F(\beta, t, m, M)$
Truncate: $x \mapsto \mathrm{fl}_c(x) = \max\{y \in F(\beta, t, m, M) \mid y \le x\}$

It holds:

absolute value of the round-off error $= |\mathrm{fl}_r(x) - x| \le \frac{1}{2} \beta^{e-t}$
absolute value of the truncation error $= |\mathrm{fl}_c(x) - x| < \beta^{e-t}$

Example 5.3 $\beta = 10$ (decimal system), $t = 2$ (2-digit mantissa), $x = 475$, exponent $e = 3$

$\mathrm{fl}_r(x) = 0.48 \cdot 10^3$    round-off
$\mathrm{fl}_c(x) = 0.47 \cdot 10^3$    truncate

$|\mathrm{fl}_r(x) - x| = |480 - 475| = 5 \le \frac{1}{2} \cdot 10^{3-2} = 5$
$|\mathrm{fl}_c(x) - x| = |470 - 475| = 5 < 10^{3-2} = 10$



5.1.2.2 Round-off and Truncation Errors (relative)

$$\frac{|\mathrm{fl}_r(x) - x|}{|x|} \le \frac{1}{2} \beta^{1-t}$$

$$\frac{|\mathrm{fl}_c(x) - x|}{|x|} < \beta^{1-t}$$

Example 5.4 relative round-off error ($\beta = 10$, $t = 2$)

$$\frac{|480 - 475|}{|475|} = \frac{1}{95} < \frac{1}{2} \cdot 10^{-1} = \frac{1}{20}$$

$$\frac{|110 - 105|}{|105|} = \frac{1}{21} < \frac{1}{20}$$

→ upper bound for the smallest number!
For a fixed number of digits, the relative error gets bigger for smaller numbers!

Example 5.5 $t = 3$, $\beta = 10$

$$110 \cdot 105 = 11550 \neq 11600 = \mathrm{fl}_r(11550)$$

Caution: Field axioms violated!

$F(\beta, t, m, M)$ is not closed w.r.t. multiplication.
Let $\star \in \{+, -, \cdot, \mathrm{div}\}$. Then

$$\exists x, y \in F(\beta, t, m, M) : \mathrm{fl}_r(x \star y) \neq x \star y$$

5.1.3 Cancellation 

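Example 5.6 Let $\beta = 10$ and $t = 8$

$$\begin{aligned}
a &= 0.1 \cdot 10^9 \\
b &= 0.1 \cdot 10^1 \\
c &= -0.1 \cdot 10^9 \\
a + b + c &= 0.1 \cdot 10^1 = 1 \\
\mathrm{fl}_r(\mathrm{fl}_r(a + b) + c) &= 0.1 \cdot 10^9 - 0.1 \cdot 10^9 = 0 \\
\mathrm{fl}_r(a + \mathrm{fl}_r(b + c)) &= 0.1 \cdot 10^9 - 0.99999999 \cdot 10^8 = 1 \\
\mathrm{fl}_r(\mathrm{fl}_r(a + c) + b) &= 0 + 0.1 \cdot 10^1 = 1
\end{aligned}$$

⇒ The associative law is not valid in $F(\beta, t, m, M)$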
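The loss of the small summand can be reproduced with decimal arithmetic limited to 8 significant digits. This sketch uses Python's `decimal` module as a stand-in for $F(10, 8, m, M)$; ignoring the exponent bounds $m$ and $M$ and the chosen rounding mode are assumptions of the illustration.

```python
from decimal import Context, Decimal, ROUND_HALF_EVEN

# Arithmetic context with an 8-digit mantissa; every operation rounds its result.
ctx = Context(prec=8, rounding=ROUND_HALF_EVEN)

a = Decimal("0.1E9")
b = Decimal("0.1E1")
c = Decimal("-0.1E9")

left_to_right = ctx.add(ctx.add(a, b), c)   # fl(fl(a + b) + c): b is lost in a + b
reordered = ctx.add(ctx.add(a, c), b)       # fl(fl(a + c) + b): a and c cancel first
print(left_to_right, reordered)
```

The two groupings give 0 and 1 respectively, so the result depends on the order of the additions.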
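5.1.4 Condition Analysis

Example 5.7 Solve the linear system

$$\begin{aligned} x + ay &= 1 \\ ax + y &= 0 \end{aligned}$$

Substituting $y = -ax$ from the second equation into the first yields

$$x - a^2 x = 1 \quad\Rightarrow\quad x = \frac{1}{1 - a^2} \quad\text{for } a \neq \pm 1$$

$a = 1.002$ = exact value
$\tilde{a} = 1.001$ = value with measurement or round-off error

relative error:

$$\left|\frac{\tilde{a} - a}{a}\right| = \frac{1}{1002}$$

solution:

$$x = -\frac{1}{0.004004} \approx -249.75, \qquad \tilde{x} = -\frac{1}{0.002001} \approx -499.75$$

relative error:

$$\left|\frac{\tilde{x} - x}{x}\right| \approx \left|\frac{-250}{249.75}\right| = 1.001 \quad (100\%\ \text{error})$$

See Figure 5.1.

Figure 5.1: Gain of the input error under ill-condition.

The matrix $A = \begin{pmatrix} 1 & a \\ a & 1 \end{pmatrix}$ is singular for $a = 1$, i.e. $\begin{vmatrix} 1 & 1 \\ 1 & 1 \end{vmatrix} = 0$.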
Definition 5.2 Let $P$ be the problem to calculate the function $f(x)$ with given input $x$.
The condition number $C_P$ is the factor by which a relative error $\frac{\Delta x}{x}$ in the input $x$ will be
increased, i.e.

$$\left|\frac{f(x + \Delta x) - f(x)}{f(x)}\right| = C_P \left|\frac{\Delta x}{x}\right|$$

It holds:

$$C_P = \left|\frac{(f(x + \Delta x) - f(x))/f(x)}{\Delta x / x}\right| \approx \left|\frac{f'(x)}{f(x)} \, x\right|$$

Example 5.8 Calculation of $C_P$

$$x = f(a) = \frac{1}{1 - a^2}, \qquad f'(a) = \frac{2a}{(1 - a^2)^2}$$

$$C_P \approx \left|\frac{2a}{(1 - a^2)^2} (1 - a^2) \, a\right| = \left|\frac{2a^2}{1 - a^2}\right| = 501.5$$

direct calculation (see above): $C_P \approx 1002$
Factor 2 due to linearization of $f$ in $a$!



Definition 5.3 A problem is ill-conditioned (well-conditioned) if $C_P \gg 1$
($C_P < 1$ or $C_P \approx 1$).

Note: $C_P$ depends on the input data!
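The two values of the condition number for the system $x + ay = 1$, $ax + y = 0$ can be recomputed in a few lines. The sketch below compares the directly measured error amplification with the linearized estimate $|2a^2/(1-a^2)|$; the function names are illustrative.

```python
def solve_x(a):
    """Solution component x = 1 / (1 - a^2) of the 2x2 system."""
    return 1.0 / (1.0 - a * a)

def condition_number_linearized(a):
    """Linearized condition number |f'(a) * a / f(a)| = |2 a^2 / (1 - a^2)|."""
    return abs(2.0 * a * a / (1.0 - a * a))

a_exact, a_perturbed = 1.002, 1.001
x_exact = solve_x(a_exact)
x_perturbed = solve_x(a_perturbed)

relative_input_error = abs((a_perturbed - a_exact) / a_exact)
relative_output_error = abs((x_perturbed - x_exact) / x_exact)
amplification = relative_output_error / relative_input_error
c_p = condition_number_linearized(a_exact)
print(x_exact, x_perturbed, amplification, c_p)
```

The measured amplification is about twice the linearized value because the perturbation moves $a$ halfway towards the singularity at $a = 1$, where the linear approximation is no longer accurate.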

5.2 Numerics of Linear Systems of Equations 

see [1] 

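5.2.1 Solving linear equations (Gauß' method)

Linear system $Ax = b$:

$$\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n &= b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n &= b_2 \\
&\ \ \vdots \\
a_{n1} x_1 + a_{n2} x_2 + \cdots + a_{nn} x_n &= b_n
\end{aligned}$$

$a_{ij} \in \mathbb{R}$, $n \ge 1$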



Questions: 

• Is L solvable? 

• Is there a unique solution? 

• How to calculate the solutions? 

• Is there an efficient algorithm? 





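Gaußian Elimination Method

$$\begin{array}{rcl}
a_{11} x_1 + a_{12} x_2 + \cdots\cdots\cdots + a_{1n} x_n &=& b_1 \\
a_{22} x_2 + \cdots\cdots\cdots + a_{2n} x_n &=& b_2 \\
\ddots \qquad\qquad & & \vdots \\
a_{kk} x_k + \cdots + a_{kj} x_j + \cdots + a_{kn} x_n &=& b_k \\
\vdots \qquad\qquad & & \vdots \\
a_{ik} x_k + \cdots + a_{ij} x_j + \cdots + a_{in} x_n &=& b_i \\
\vdots \qquad\qquad & & \vdots \\
a_{nk} x_k + \cdots + a_{nj} x_j + \cdots + a_{nn} x_n &=& b_n
\end{array}$$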

The algorithm:

for k=1,...,n-1
    search a_mk with |a_mk| = max{ |a_lk| : l >= k }
    if a_mk=0 print "singular"; stop
    swap rows m and k (including b_m and b_k)
    for i=k+1,...,n
        q_ik := a_ik / a_kk
        for j=k+1,...,n
            a_ij := a_ij - q_ik*a_kj
        end
        b_i := b_i - q_ik*b_k
    end
end
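The pseudocode above translates almost line by line into a runnable sketch. The following Python version works on nested lists with 0-based indices and returns copies instead of modifying its arguments; the function name and the test system are chosen for illustration only.

```python
def gaussian_elimination(a, b):
    """Forward elimination with column pivoting.

    Returns the upper triangular matrix and the transformed right-hand side.
    """
    n = len(a)
    a = [row[:] for row in a]   # work on copies
    b = b[:]
    for k in range(n - 1):
        # pivot search: row m >= k with the largest |a[m][k]|
        m = max(range(k, n), key=lambda l: abs(a[l][k]))
        if a[m][k] == 0:
            raise ValueError("matrix is singular")
        a[k], a[m] = a[m], a[k]
        b[k], b[m] = b[m], b[k]
        for i in range(k + 1, n):
            q = a[i][k] / a[k][k]
            a[i][k] = 0.0       # eliminated entry
            for j in range(k + 1, n):
                a[i][j] -= q * a[k][j]
            b[i] -= q * b[k]
    return a, b

A = [[2.0, 1.0, -1.0], [-3.0, -1.0, 2.0], [-2.0, 1.0, 2.0]]
rhs = [8.0, -11.0, -3.0]
U, c = gaussian_elimination(A, rhs)
for row, value in zip(U, c):
    print(row, value)
```

The returned system has zeros below the diagonal and the same solution as the original one; it can then be solved from the last row upwards.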



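Theorem 5.1 Complexity: The number of operations of the Gaußian elimination for large
$n$ is approximately equal to $\frac{1}{3} n^3$.



Proof:

$k$-th step: $(n - k)$ rows $\times$ $(n - k + 2)$ columns, i.e. $(n - k)(n - k + 2)$ operations

total:

$$T(n) = \sum_{k=1}^{n-1} (n - k)(n - k + 2) \overset{l := n - k}{=} \sum_{l=1}^{n-1} l(l + 2) = \sum_{l=1}^{n-1} (l^2 + 2l)$$

$$= \frac{n^3}{3} - \frac{n^2}{2} + \frac{n}{6} + n(n - 1) = \frac{n^3}{3} + \frac{n^2}{2} - \frac{5}{6} n$$

$$\Rightarrow\ \text{for large } n: \quad T(n) \approx \frac{n^3}{3}$$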
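The closed form of the operation count can be compared with the sum for concrete $n$. This is a small numerical check, not a proof; the function names are illustrative.

```python
from fractions import Fraction

def operations_by_summation(n):
    """Sum of (n - k)(n - k + 2) over the elimination steps k = 1, ..., n - 1."""
    return sum((n - k) * (n - k + 2) for k in range(1, n))

def operations_closed_form(n):
    """Closed form n^3/3 + n^2/2 - 5n/6 of the same sum."""
    return Fraction(n ** 3, 3) + Fraction(n ** 2, 2) - Fraction(5 * n, 6)

# Ratio T(n) / (n^3 / 3) approaches 1 for growing n.
ratios = {n: operations_by_summation(n) / (n ** 3 / 3) for n in (10, 100, 1000)}
print(ratios)
```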
Example 5.9 Computer with 1 GFLOPS

n        T(n)
10       $1/3 \cdot 10^3 \cdot 10^{-9}$ sec $\approx$ 0.3 μsec
100      $1/3 \cdot 100^3 \cdot 10^{-9}$ sec $\approx$ 0.3 msec
1000     $1/3 \cdot 1000^3 \cdot 10^{-9}$ sec $\approx$ 0.3 sec
10000    $1/3 \cdot 10000^3 \cdot 10^{-9}$ sec $\approx$ 300 sec = 5 min



Problems/Improvements:

1. long computing times for large $n$

• better algorithms: $T(n) = C \cdot n^{2.38}$ instead of $\frac{1}{3} n^3$

• iterative method (Gauß-Seidel)

2. round-off error

• complete pivoting

• Gauß-Seidel

Applications: 

• Construction of curves through given points 

• Estimation of parameters (least squares) 

• Linear Programming 

• Computer graphics, image processing (e.g. computer tomography) 

• Numerical solving of differential equations 



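5.2.1.2 Backward Substitution

After $n - 1$ elimination steps:

$$A'x = b \quad\text{with}\quad A' = \begin{pmatrix} a'_{11} & a'_{12} & \cdots & a'_{1n} \\ 0 & a'_{22} & \cdots & a'_{2n} \\ \vdots & \ddots & \ddots & \vdots \\ 0 & \cdots & 0 & a'_{nn} \end{pmatrix}$$

Calculation of $x_1, \ldots, x_n$:

$$x_n = \frac{b_n}{a'_{nn}}, \qquad x_{n-1} = \frac{b_{n-1} - a'_{n-1,n}\, x_n}{a'_{n-1,n-1}}$$

General:

$$x_i = \frac{b_i - \sum_{k=i+1}^{n} a'_{ik} x_k}{a'_{ii}}, \qquad i = n, n - 1, \ldots, 1$$

Runtime:

• Divisions: $n$

• Number of additions and multiplications:

$$\sum_{i=1}^{n} (i - 1) = \sum_{i=1}^{n-1} i = \frac{1}{2} n(n - 1) \approx \frac{1}{2} n^2$$

⇒ Substitution is much faster than elimination!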
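The general formula for $x_i$ can be sketched directly in code. The following Python function assumes an upper triangular matrix stored as nested lists; the name `back_substitution` and the sample system are illustrative.

```python
def back_substitution(u, c):
    """Solve U x = c for an upper triangular matrix U, starting with the last row."""
    n = len(c)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        if u[i][i] == 0:
            raise ValueError("zero on the diagonal: no unique solution")
        # subtract the contributions of the already known x_k with k > i
        known = sum(u[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (c[i] - known) / u[i][i]
    return x

U = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 1.0],
     [0.0, 0.0, 4.0]]
c = [1.0, 9.0, 12.0]
x = back_substitution(U, c)
print(x)
```

Each row needs one division and at most $n - 1$ multiply-add steps, which is the source of the quadratic runtime stated above.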

5.2.1.3 Backward Elimination

A slight variant of the backward substitution is the backward elimination, where the upper
right triangle of the matrix is eliminated similarly to the Gauß elimination. This
variant is called Gauß-Jordan method. One application of this method is the computation
of inverse matrices.



Theorem 5.2 Correctness: The Gaußian method results in a unique solution $(x_1, \ldots, x_n)$
if and only if the linear system $L$ has a unique solution $(x_1, \ldots, x_n)$.

Proof: as exercise



5.2.2 Iterative improvement of the solution 

Let $\bar{x}$ be the solution of $Ax = b$ calculated with the Gauß method. In general $A\bar{x} = b - r$ with
$r \neq 0$ ($r$: residual vector) because of $x = \bar{x} + \Delta x$.

$$A\bar{x} = A(x - \Delta x) = b - r \quad\Rightarrow\quad A \cdot \Delta x = r$$

With this equation the correction $\Delta x$ can be calculated. ⇒ better approximation for $x$:

$$x^{(2)} = \bar{x} + \Delta x$$



Iterative method:

$x^{(1)} := \bar{x}$
for $n = 1, 2, 3, \ldots$:
    $r^{(n)} = b - A x^{(n)}$
    calculate $\Delta x^{(n)}$ from $A \, \Delta x^{(n)} = r^{(n)}$
    $x^{(n+1)} = x^{(n)} + \Delta x^{(n)}$

Note:

1. Usually ($A$ not very ill-conditioned) very few iterations ($\approx 3$) are necessary.

2. Solving $A \, \Delta x^{(n)} = r^{(n)}$ is time-consuming: $O(\frac{1}{3} n^3)$. With the LU decomposition (see 5.2.3)
of $A$, $A \, \Delta x^{(n)} = r^{(n)}$ can be solved in $O(n^2)$ steps.

3. If a system of equations must be solved for more than one right hand side, all solutions
can be calculated simultaneously (elimination necessary only once!).



5.2.3 LU-Decomposition 

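The Gaußian elimination (see algorithm) uses the factor $q_{ik} := a_{ik}/a_{kk}$ for
the elimination of each element in the $k$-th column below the diagonal: it subtracts $q_{ik}$ times
row $k$ from row $i$. If we write all
calculated $q_{ik}$ in a lower triangular matrix, in which we add ones in the diagonal, we get

$$L := \begin{pmatrix}
1 & 0 & \cdots & \cdots & 0 \\
q_{21} & 1 & 0 & & \vdots \\
q_{31} & q_{32} & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
q_{n1} & q_{n2} & \cdots & q_{n,n-1} & 1
\end{pmatrix}.$$

Furthermore, let

$$U := A' = \begin{pmatrix}
a'_{11} & a'_{12} & \cdots & a'_{1n} \\
0 & a'_{22} & \cdots & a'_{2n} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & a'_{nn}
\end{pmatrix}$$

be the upper triangular matrix after the elimination.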



Theorem 5.3 Then L ■ U = A holds and the solution x of the system Ax = b for any right 
hand side b can be calculated by solving the equation L ■ c = b for c and solving U ■ x = c 
for x. 



The system L-c = b is solved by forward substitution and U -x = c by backward substitution. 
Proof: We will show that L ■ U = A. Then obviously it holds 

A-x = L- U- x = b. 



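Now we write $L \cdot U = A$ in detail:

\[
L \cdot U =
\begin{pmatrix}
1 & 0 & \cdots & \cdots & 0 \\
q_{21} & 1 & 0 & & \vdots \\
q_{31} & q_{32} & 1 & \ddots & \vdots \\
\vdots & \vdots & \ddots & \ddots & 0 \\
q_{n1} & q_{n2} & \cdots & q_{n\,n-1} & 1
\end{pmatrix}
\cdot
\begin{pmatrix}
a'_{11} & a'_{12} & \cdots & a'_{1n} \\
0 & a'_{22} & \cdots & a'_{2n} \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & a'_{nn}
\end{pmatrix}
= A
\]

We now apply the Gaussian elimination on both sides and get

\[
\begin{pmatrix}
1 & & 0 \\
 & \ddots & \\
0 & & 1
\end{pmatrix}
\cdot U = U.
\]

Thus $LU = A$. Because of the associativity of matrix multiplication only $L$ has to be eliminated on the left side. □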
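The construction of $L$ and $U$ and the two substitution steps can be sketched as follows. This is a minimal illustration, not the lecture's reference implementation: the function names are assumptions, and the sketch uses no pivoting, so it presumes that no zero appears on the diagonal during elimination.

```python
def lu_decompose(A):
    """Return (L, U) with L * U = A (Gaussian elimination without pivoting)."""
    n = len(A)
    U = [[float(v) for v in row] for row in A]
    L = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(k + 1, n):
            q = U[i][k] / U[k][k]        # elimination factor q_ik
            L[i][k] = q                  # store it below the diagonal of L
            for j in range(k, n):
                U[i][j] -= q * U[k][j]   # row_i <- row_i - q_ik * row_k
    return L, U


def forward_substitution(L, b):
    """Solve L c = b for a lower triangular L with ones on the diagonal."""
    c = []
    for i in range(len(b)):
        c.append(b[i] - sum(L[i][j] * c[j] for j in range(i)))
    return c


def backward_substitution(U, c):
    """Solve U x = c for an upper triangular U."""
    n = len(c)
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (c[i] - s) / U[i][i]
    return x


A = [[2, 1, 1], [4, 3, 3], [8, 7, 9]]   # made-up example matrix
L, U = lu_decompose(A)
b = [4, 10, 24]                          # equals A * (1, 1, 1)
x = backward_substitution(U, forward_substitution(L, b))
print(L, U, x)
```

Once $L$ and $U$ are known, every further right-hand side only costs the two substitutions, each of which needs about $n^2$ operations.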

Exercise 5.1 How could you factor A into a product UL, upper triangular times lower 
triangular? Would they be the same factors as in A = LU ? 

5.2.4 Condition Analysis for Matrices 

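$Ax = b$ with $A$: matrix ($n \times n$) and $x, b \in \mathbb{R}^n$
What is the norm of a matrix?

Vector Norm:

Definition 5.4 (p-Norm)

\[ \forall x \in \mathbb{R}^n : \quad \|x\|_p = \left( |x_1|^p + |x_2|^p + \cdots + |x_n|^p \right)^{\frac{1}{p}}, \qquad 1 \le p < \infty \]

Theorem 5.4 $\|x\|_p$ is a norm, i.e. it has the properties:

• $\forall x \neq 0 : \|x\|_p > 0$ ; $\|x\|_p = 0 \Leftrightarrow x = 0$

• $\forall \alpha \in \mathbb{R} : \|\alpha x\|_p = |\alpha| \cdot \|x\|_p$

• $\forall x, y \in \mathbb{R}^n : \|x + y\|_p \le \|x\|_p + \|y\|_p$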
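As a small numerical illustration of Definition 5.4 and the properties in Theorem 5.4, the following sketch computes a few p-norms and checks homogeneity and the triangle inequality on made-up vectors. The function name `p_norm` and the test data are assumptions for this example; a numerical check is of course no substitute for the proof.

```python
def p_norm(x, p):
    """p-norm of the vector x for 1 <= p < infinity."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


x = [3.0, -4.0]
y = [1.0, 2.0]

norm1 = p_norm(x, 1)   # |3| + |-4|
norm2 = p_norm(x, 2)   # sqrt(9 + 16)

# Triangle inequality for several values of p (small tolerance for rounding)
triangle_ok = all(
    p_norm([xi + yi for xi, yi in zip(x, y)], p)
    <= p_norm(x, p) + p_norm(y, p) + 1e-12
    for p in (1, 1.5, 2, 3, 10)
)

# Homogeneity: ||alpha * x||_p = |alpha| * ||x||_p
alpha = -2.5
lhs = p_norm([alpha * xi for xi in x], 3)
rhs = abs(alpha) * p_norm(x, 3)
homogeneous = abs(lhs - rhs) < 1e-12

print(norm1, norm2, triangle_ok, homogeneous)
```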

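Lemma 5.1 (Hölder inequality) For real numbers $p, q > 1$ with $\frac{1}{p} + \frac{1}{q} = 1$ and vectors $x, y \in \mathbb{R}^n$ we have

\[ \|xy\|_1 \le \|x\|_p \, \|y\|_q . \]

Proof: Since $\|xy\|_1 = \left| \sum_{i=1}^{n} x_i y_i \right| \le \sum_{i=1}^{n} |x_i y_i|$ it remains to prove

\[ \sum_{i=1}^{n} \frac{|x_i y_i|}{\|x\|_p \, \|y\|_q} \le 1 . \]

For real numbers $a, b > 0$ we have (proof as exercise)

\[ ab \le \frac{a^p}{p} + \frac{b^q}{q} , \]

which we apply now to get

\[
\sum_{i=1}^{n} \frac{|x_i|\,|y_i|}{\|x\|_p \, \|y\|_q}
= \sum_{i=1}^{n} \frac{|x_i|}{\|x\|_p} \cdot \frac{|y_i|}{\|y\|_q}
\le \sum_{i=1}^{n} \left( \frac{1}{p} \frac{|x_i|^p}{\|x\|_p^p} + \frac{1}{q} \frac{|y_i|^q}{\|y\|_q^q} \right)
= \frac{1}{p \, \|x\|_p^p} \sum_{i=1}^{n} |x_i|^p + \frac{1}{q \, \|y\|_q^q} \sum_{i=1}^{n} |y_i|^q
= \frac{1}{p} + \frac{1}{q} = 1 .
\]

□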



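Proof of proposition 3 in Theorem 5.4: For the cases $p = 1$ and $p = \infty$ see exercises.
For $1 < p < \infty$:

\[ |x_i + y_i|^p = |x_i + y_i| \, |x_i + y_i|^{p-1} \le (|x_i| + |y_i|) \, |x_i + y_i|^{p-1} = |x_i| \, |x_i + y_i|^{p-1} + |y_i| \, |x_i + y_i|^{p-1} \]

Summation yields

\[ \sum_{i=1}^{n} |x_i + y_i|^p \le \sum_{i=1}^{n} |x_i| \, |x_i + y_i|^{p-1} + \sum_{i=1}^{n} |y_i| \, |x_i + y_i|^{p-1} . \qquad (5.1) \]

Application of the Hölder inequality to both terms on the right-hand side gives

\[ \sum_{i=1}^{n} |x_i| \, |x_i + y_i|^{p-1} \le \left( \sum_{i=1}^{n} |x_i|^p \right)^{\frac{1}{p}} \left( \sum_{i=1}^{n} \left( |x_i + y_i|^{p-1} \right)^q \right)^{\frac{1}{q}} \]

and

\[ \sum_{i=1}^{n} |y_i| \, |x_i + y_i|^{p-1} \le \left( \sum_{i=1}^{n} |y_i|^p \right)^{\frac{1}{p}} \left( \sum_{i=1}^{n} \left( |x_i + y_i|^{p-1} \right)^q \right)^{\frac{1}{q}} , \]

which we substitute in Equation 5.1 to obtain

\[ \sum_{i=1}^{n} |x_i + y_i|^p \le \left( \left( \sum_{i=1}^{n} |x_i|^p \right)^{\frac{1}{p}} + \left( \sum_{i=1}^{n} |y_i|^p \right)^{\frac{1}{p}} \right) \left( \sum_{i=1}^{n} |x_i + y_i|^p \right)^{\frac{1}{q}} . \]

In the rightmost factor we used $(p - 1) q = p$. Now we divide by the rightmost factor, using $\frac{1}{p} = 1 - \frac{1}{q}$, and get the assertion

\[ \left( \sum_{i=1}^{n} |x_i + y_i|^p \right)^{\frac{1}{p}} \le \left( \sum_{i=1}^{n} |x_i|^p \right)^{\frac{1}{p}} + \left( \sum_{i=1}^{n} |y_i|^p \right)^{\frac{1}{p}} . \]

□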



Lemma 5.2

\[ \|x\|_\infty := \max_{1 \le i \le n} |x_i| = \lim_{p \to \infty} \|x\|_p \]

$\|\cdot\|_\infty$ is called maximum norm.
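The limit in Lemma 5.2 can be observed numerically. The following sketch, with made-up data and assumed function names, prints $\|x\|_p$ for growing $p$ next to the maximum norm.

```python
def p_norm(x, p):
    """p-norm of the vector x for 1 <= p < infinity."""
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)


def max_norm(x):
    """Maximum norm: largest absolute value of a component."""
    return max(abs(xi) for xi in x)


x = [1.0, -3.0, 2.0]
for p in (1, 2, 4, 8, 16, 32, 64):
    print(p, p_norm(x, p))        # decreases towards the maximum norm
print("max norm:", max_norm(x))
```

The values decrease monotonically in $p$ because the largest component dominates the sum more and more strongly.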



In the following let $\|x\| = \|x\|_\infty$ (maximum norm).

Definition 5.5 For any vector norm $\|\cdot\|$ the canonical matrix norm is defined as follows:

\[ \|A\| = \max_{x \neq 0} \frac{\|Ax\|}{\|x\|} \]

Lemma 5.3 The matrix norm is a norm and for an $n \times m$ matrix $A$ it holds

\[ \|A\|_\infty = \max_{1 \le i \le n} \sum_{j=1}^{m} |a_{ij}| \]

\[ \|Ax\| \le \|A\| \cdot \|x\| \]

\[ \|AB\| \le \|A\| \cdot \|B\| \]
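The row-sum formula of Lemma 5.3 is easy to try out. In the sketch below the matrix, the sample vectors and all function names are made up for illustration. The vector `x_star` carries the signs of the row with the largest absolute row sum, so the quotient $\|Ax\|/\|x\|$ reaches the maximum for it.

```python
def max_norm(x):
    """Maximum norm of a vector."""
    return max(abs(xi) for xi in x)


def matrix_max_norm(A):
    """Row-sum norm: largest sum of absolute values within one row."""
    return max(sum(abs(a) for a in row) for row in A)


def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


A = [[1.0, -2.0, 3.0], [4.0, 0.0, -1.0]]   # a 2 x 3 example matrix
norm_A = matrix_max_norm(A)                 # max(1 + 2 + 3, 4 + 0 + 1)

x_star = [1.0, -1.0, 1.0]                   # signs of the first row
attained = max_norm(mat_vec(A, x_star)) / max_norm(x_star)

samples = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0], [-2.0, 0.5, 3.0]]
bound_ok = all(
    max_norm(mat_vec(A, x)) <= norm_A * max_norm(x) for x in samples
)
print(norm_A, attained, bound_ok)
```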



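Condition of a matrix: Effect of errors in the matrix elements of $A$ or in the right-hand side $b$ on the error in the solution $x$.

1. Error in $b$:

\[
\begin{aligned}
b &\to b + \Delta b \\
\Rightarrow \quad x &\to x + \Delta x \\
\Rightarrow \quad A(x + \Delta x) &= b + \Delta b \\
\Rightarrow \quad \Delta x &= A^{-1} \Delta b \\
\Rightarrow \quad \|\Delta x\| &= \|A^{-1} \Delta b\| \le \|A^{-1}\| \cdot \|\Delta b\|
\end{aligned}
\]

\[ b = Ax \quad \Rightarrow \quad \|b\| \le \|A\| \cdot \|x\| \quad \Rightarrow \quad \frac{1}{\|x\|} \le \frac{\|A\|}{\|b\|} \]

\[ \Rightarrow \quad \frac{\|\Delta x\|}{\|x\|} \le C_A \, \frac{\|\Delta b\|}{\|b\|} \qquad \text{with } C_A = \|A\| \cdot \|A^{-1}\| \]

$C_A$: condition number of $A$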
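The bound on the relative error can be checked on a small example. The following sketch uses a made-up $2 \times 2$ matrix and perturbation (they are not taken from the lecture) together with the maximum norm. All function names are assumptions for this illustration.

```python
def max_norm(x):
    """Maximum norm of a vector."""
    return max(abs(xi) for xi in x)


def matrix_max_norm(A):
    """Row-sum norm, the matrix norm belonging to the maximum norm."""
    return max(sum(abs(a) for a in row) for row in A)


def mat_vec(A, x):
    """Matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def inverse_2x2(A):
    """Inverse of a regular 2 x 2 matrix via the determinant formula."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1.0, 2.0], [3.0, 4.0]]
A_inv = inverse_2x2(A)
cond_A = matrix_max_norm(A) * matrix_max_norm(A_inv)   # C_A = ||A|| * ||A^-1||

b = [1.0, 1.0]
db = [0.01, -0.01]            # perturbation of the right-hand side
x = mat_vec(A_inv, b)         # solution of A x = b
dx = mat_vec(A_inv, db)       # resulting change of the solution

rel_err_x = max_norm(dx) / max_norm(x)
rel_err_b = max_norm(db) / max_norm(b)
print(cond_A, rel_err_x, cond_A * rel_err_b)
```

Here the observed relative error in $x$ stays well below the bound $C_A \cdot \|\Delta b\| / \|b\|$. The bound describes the worst case over all perturbations, so it is usually not attained.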



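2. Error in $A$:

\[
\begin{aligned}
(A + \Delta A)(x + \Delta x) &= b \\
x + \Delta x &= (A + \Delta A)^{-1} b = (A + \Delta A)^{-1} A x \\
\Delta x &= \left( (A + \Delta A)^{-1} A - I \right) x \\
&= (A + \Delta A)^{-1} \left( A - (A + \Delta A) \right) x \\
&= -(A + \Delta A)^{-1} \Delta A \, x
\end{aligned}
\]

Example 5.10

\[ \Rightarrow \quad C_A = 1001 \]

\[ \|\Delta A\| = 0.001, \qquad \|A\| = 2.002 \]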




5.3 Roots of Nonlinear Equations 

given: nonlinear equation $f(x) = 0$
sought: solution(s) (root(s))

5.3.1 Approximate Values, Starting Methods
Draw the graph of $f(x)$ and set up a value table.
Example 5.11

\[ f(x) = \left( \frac{x}{2} \right)^2 - \sin x \]

Figure 5.2: Graph to find the start value.

Table:

x      (x/2)^2   sin x     f(x)
1.6    0.64      0.9996    < 0
1.8    0.81      0.974     < 0
2.0    1.00      0.909     > 0

$\Rightarrow$ Root in $[1.8; 2.0]$

In general: if $f$ is continuous and $f(a) \cdot f(b) < 0$, then $f$ has a root in $[a, b]$.
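The value table and the sign-change criterion can be reproduced with a short script. This is a sketch with an assumed helper name `find_brackets`. It scans neighbouring table points and reports every pair on which $f$ changes its sign.

```python
import math


def f(x):
    """Function from Example 5.11."""
    return (x / 2) ** 2 - math.sin(x)


def find_brackets(f, xs):
    """Return all neighbouring pairs (a, b) from xs with f(a) * f(b) < 0."""
    return [(a, b) for a, b in zip(xs, xs[1:]) if f(a) * f(b) < 0]


xs = [1.6, 1.8, 2.0]
for x in xs:
    # columns: x, (x/2)^2, sin x, f(x)
    print(f"{x:.1f}  {(x / 2) ** 2:.2f}  {math.sin(x):.4f}  {f(x):+.4f}")

brackets = find_brackets(f, xs)
print(brackets)
```

A scan like this only detects roots at which $f$ changes its sign between two table points, so a coarse table can miss roots.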

Interval bisection method 
Requirements:

$f : [a, b] \to \mathbb{R}$ continuous and $f(a) \cdot f(b) < 0$.

Without loss of generality $f(a) < 0$, $f(b) > 0$ (otherwise take $-f(x)$).

Figure 5.3: A root in the interval $[a, b]$ can be determined quickly by using the interval bisection method.



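Algorithm (with $a_0 = a$, $b_0 = b$):

\[ m_k = \frac{1}{2} \left( a_{k-1} + b_{k-1} \right) \]

\[
(a_k, b_k) =
\begin{cases}
(m_k, b_{k-1}) & \text{if } f(m_k) < 0 \\
(a_{k-1}, m_k) & \text{if } f(m_k) > 0
\end{cases}
\]

(Root found exactly if $f(m_k) = 0$!)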
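The algorithm translates almost literally into code. The following is a minimal sketch, applied to the function of Example 5.11 on the interval $[1.8, 2.0]$. The function name `bisect`, its return convention (the final interval) and the choice of 30 steps are assumptions made for this illustration.

```python
import math


def bisect(f, a, b, steps):
    """Interval bisection; assumes f is continuous and f(a) * f(b) < 0."""
    if f(a) * f(b) >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    if f(a) > 0:
        # without loss of generality f(a) < 0 < f(b): otherwise take -f
        return bisect(lambda x: -f(x), a, b, steps)
    for _ in range(steps):
        m = (a + b) / 2          # midpoint m_k
        fm = f(m)
        if fm == 0:              # root found exactly
            return m, m
        if fm < 0:
            a = m                # root lies in [m, b]
        else:
            b = m                # root lies in [a, m]
    return a, b


def f(x):
    """Function from Example 5.11."""
    return (x / 2) ** 2 - math.sin(x)


a, b = bisect(f, 1.8, 2.0, 30)
root = (a + b) / 2
print(a, b, root)
```

Every step halves the interval, so the number of steps determines the length of the final interval in advance, independently of the shape of $f$.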



Theorem 5.5 Let $f : [a, b] \to \mathbb{R}$ be continuous with $f(a) \cdot f(b) < 0$. Then the interval bisection method converges to a root $x$ of $f$. After $n$ steps $x$ is determined with a precision of






